Age | Commit message (Collapse) | Author |
|
Recoverable page faults are represented by the xnack mode setting inside
a kfd process and are used to represent the device page faults. For CR,
we don't consider negative values which are typically used for querying
the current xnack mode without modifying it.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When doing a restore on a different node, the gpu_id's on the restore
node may be different. But the user space application will still refer
use the original gpu_id's in the ioctl calls. Adding code to create a
gpu id mapping so that kfd can determine actual gpu_id during the user
ioctl's.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support to existing CRIU ioctl's to save and restore events during
criu checkpoint and restore.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Checkpoint contents of queue control stacks on CRIU dump and restore them
during CRIU restore.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Checkpoint contents of queue MQD's on CRIU dump and restore them during
CRIU restore.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When re-creating queues during CRIU restore, restore the queue with the
same queue id value used during CRIU dump.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support to existing CRIU ioctl's to save number of queues and queue
properties for each queue during checkpoint and re-create queues on
restore.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Introducing UNPAUSE op. After CRIU amdgpu plugin performs a PROCESS_INFO
op the queues will be stay in an evicted state. Once the plugin is done
draining BO contents, it is safe to perform an UNPAUSE op for the queues
to resume.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This adds support to create userptr BOs on restore and introduces a new
ioctl op to restart memory notifiers for the restored userptr BOs.
When doing CRIU restore MMU notifications can happen anytime after we call
amdgpu_mn_register. Prevent MMU notifications until we reach stage-4 of the
restore process i.e. criu_resume ioctl op is received, and the process is
ready to be resumed. This ioctl is different from other KFD CRIU ioctls
since its called by CRIU master restore process for all the target
processes being resumed by CRIU.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This adds support to discover the buffer objects that belong to a
process being checkpointed. The data corresponding to these buffer
objects is returned to user space plugin running under criu master
context which then stores this info to recreate these buffer objects
during a restore operation.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Checkpoint-Restore in userspace (CRIU) is a powerful tool that can
snapshot a running process and later restore it on same or a remote
machine but expects the processes that have a device file (e.g. GPU)
associated with them, provide necessary driver support to assist CRIU
and its extensible plugin interface. Thus, In order to support the
Checkpoint-Restore of any ROCm process, the AMD Radeon Open Compute
Kernel driver, needs to provide a set of new APIs that provide
necessary VRAM metadata and its contents to a userspace component
(CRIU plugin) that can store it in form of image files.
This introduces some new ioctls which will be used to checkpoint-Restore
any KFD bound user process. KFD only allows ioctl calls from the same
process that opened the KFD file descriptor. Since these ioctls are
expected to be called from a KFD criu plugin which has elevated ptrace
attached privileges and CAP_CHECKPOINT_RESTORE capabilities attached with
the file descriptors so modify KFD to allow such calls.
(API redesigned by David Yat Sin)
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
CP supports unmap queue with reset mode which only destroys specific queue without affecting others.
Replacing whole gpu reset with reset queue mode for RAS poison consumption
saves much time, and we can also fallback to gpu reset solution if reset
queue fails.
v2: Return directly if process is NULL;
Reset queue solution is not applicable to SDMA, fallback to legacy
way;
Call kfd_unref_process after lookup process.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Initializes kfd->device_info given either asic_type (enum) if GFX
version is less than GFX9, or GC IP version if greater. Also takes in vf
and the target compiler gfx version. Uses SDMA version to determine
num_sdma_queues_per_engine.
Convert device_info to a non-pointer member of kfd, change references
accordingly.
Change unsupported asic condition to only probe f2g, move device_info
initialization post-switch.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
device_info->asic_name and amdgpu_asic_name[adev->asic_type] both
provide asic name strings, with the only difference being casing.
Remove asic_name from device_info and replace sysfs entry with lowercase
amdgpu_asic_name[]. Ensures string is null-terminated so that this
doesn't break if dev->node_props.name ever gets set anywhere else.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
unmap range always increase atomic svms->drain_pagefaults to simplify
both parent range and child range unmap, page fault handle ignores the
retry fault if svms->drain_pagefaults is set to speed up interrupt
handling. svm_range_drain_retry_fault restart draining if another
range unmap from cpu.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove unused entries in kfd_device_info table: num_xgmi_sdma_engines
and num_sdma_queues_per_engine. They are calculated in
kfd_get_num_sdma_engines and kfd_get_num_xgmi_sdma_engines instead.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Graham Sider <Graham.Sider@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Instead of hard coding the number of sdma engines and the number of
sdma_xgmi engines in the device_info table, get the number of toal SDMA
instances from amdgpu. The first two engines are sdma engines and the
rest are sdma-xgmi engines unless the ASIC doesn't support XGMI.
v2: add kfd_ prefix to non static function names
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
asic_family was a duplicate of asic_type, both of type amd_asic_type.
Replace all instances of device_info->asic_family with adev->asic_type
and remove asic_family from device_info.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Defined as GC HWIP >= IP_VERSION(9, 0, 1).
Also defines KFD_GC_VERSION to return GC HWIP version.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Completes removal of kgd_dev. Direct references to amdgpu_device objects
should now be used instead.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove get_amdgpu_device and other remaining kgd_dev references aside
from declaration/kfd struct entry and initialization.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Modified definitions:
- amdgpu_amdkfd_get_fw_version
- amdgpu_amdkfd_get_local_mem_info
- amdgpu_amdkfd_get_gpu_clock_counter
- amdgpu_amdkfd_get_max_engine_clock_in_mhz
- amdgpu_amdkfd_get_cu_info
- amdgpu_amdkfd_get_dmabuf_info
- amdgpu_amdkfd_get_vram_usage
- amdgpu_amdkfd_get_hive_id
- amdgpu_amdkfd_get_unique_id
- amdgpu_amdkfd_get_mmio_remap_phys_addr
- amdgpu_amdkfd_get_num_gws
- amdgpu_amdkfd_get_asic_rev_id
- amdgpu_amdkfd_get_noretry
- amdgpu_amdkfd_get_xgmi_hops_count
- amdgpu_amdkfd_get_xgmi_bandwidth_mbytes
- amdgpu_amdkfd_get_pcie_bandwidth_mbytes
Also replaces kfd_device_by_kgd with kfd_device_by_adev, now
searching via adev rather than kgd.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Patch series to remove kgd_dev struct and replace all instances with
amdgpu_device objects.
amdgpu_device needs to be declared in kgd_kfd_interface.h to be visible
to kfd2kgd_calls.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The check for whether to drain retry faults must be under the mmap write
lock to serialize with munmap notifier callbacks.
We were also missing checks on child ranges. To fix that, simplify the
logic by using a flag rather than checking on each prange. That also
allows draining less freqeuntly when many ranges are unmapped at once.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Alex Sierra <Alex.Sierra@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]:
When we call hmm_range_fault to map memory after a migration, we don't
expect memory to be migrated again as a result of hmm_range_fault. The
driver ensures that all memory is in GPU-accessible locations so that
no migration should be needed. However, there is one corner case where
hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE
back to system memory due to a write-fault when a system memory page in
the same range was mapped read-only (e.g. COW). Ranges with individual
pages in different locations are usually the result of failed page
migrations (e.g. page lock contention). The unexpected migration back
to system memory causes a deadlock from recursive locking in our
driver.
[How]:
Creating a task reference new member under svm_range_list struct.
Setting this with "current" reference, right before the hmm_range_fault
is called. This member is checked against "current" reference at
svm_migrate_to_ram callback function. If equal, the migration will be
ignored.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Actually, cu_mask has been copied to mqd memory and
does't have to persist in queue_properties. Remove it
from queue_properties.
And use struct mqd_update_info to store such properties,
then pass it to update queue operation.
v2:
* Rename pqm_update_queue to pqm_update_queue_properties.
* Rename struct queue_update_info to struct mqd_update_info.
* Rename pqm_set_cu_mask to pqm_update_mqd.
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, queue is updated with data in queue_properties.
And all allocated resource in queue_properties will not
be freed until the queue is destroyed.
But some properties(e.g., cu mask) bring some memory
management headaches(e.g., memory leak) and make code
complex. Actually they have been copied to mqd and
don't have to persist in queue_properties.
Add an argument into update queue to pass such properties,
then we can remove them from queue_properties.
v2: Don't use void *.
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, all kfd BOs use same destruction routine. But pinned
BOs are not unpinned properly. Separate them from general routine.
v2 (Felix):
Add safeguard to prevent user space from freeing signal BO.
Kunmap signal BO in the event of setting event page error.
Just kunmap signal BO to avoid duplicating the code.
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.
Move PCIe atomic detection from kgd2kfd_probe into kgd2kfd_device_init
because the MEC firmware is not loaded yet at the probe stage.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add u32 gfx_target_version field to kfd_node_properties and
kfd_device_info. Populate <asic>_device_info structs accordingly and
expose to sysfs.
This allows eliminating device-ID-based lookup tables in user mode for
future ASICs.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
start_cpsch and stop_cpsch can be called during kfd device
initialization or during gpu reset/recovery. So they can
run concurrently. Currently in start_cpsch and stop_cpsch,
pm_init and pm_uninit is not protected by the dpm lock.
Imagine such a case that user use packet manager's function
to submit a pm4 packet to hang hws (ie through command
cat /sys/class/kfd/kfd/topology/nodes/1/gpu_id | sudo tee
/sys/kernel/debug/kfd/hang_hws), while kfd device is under
device reset/recovery so packet manager can be not initialized.
There will be unpredictable protection fault in such case.
This patch moves pm_init/uninit inside the dpm lock and check
packet manager is initialized before using packet manager
function.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This is part of SVM profiling API, export sysfs counters for
per-process, per-GPU vm retry fault, pages migrated in and out of GPU vram.
counters will not be updated in parallel in GPU retry fault handler and
migration to vram/ram path, use READ_ONCE to avoid compiler
optimization.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When some GPUs don't support SVM, don't disabe it for the entire process.
That would be inconsistent with the information the process got from the
topology, which indicates SVM support per GPU.
Instead disable SVM support only for the unsupported GPUs. This is done
by checking any per-device attributes against the bitmap of supported
GPUs. Also use the supported GPU bitmap to initialize access bitmaps for
new SVM address ranges.
Don't handle recoverable page faults from unsupported GPUs. (I don't
think there will be unsupported GPUs that can generate recoverable page
faults. But better safe than sorry.)
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It is to provide more tlb flush types option for different
case scenario.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The user applications maybe register the KFD_EVENT_TYPE_HW_EXCEPTION and
KFD_EVENT_TYPE_MEMORY events, driver could notify them when poison data
consumed. Beside that, some applications maybe register SIGBUS signal
hander. These applications will handle poison data by themselves, exit
or re-create context to re-dispatch works.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In order to support multi-process debugging, HWS PM4 packet
MAP_PROCESS requires an extension of 5 DWORDS to support targeting of
per-vmid SPI debug control registers as well as watch points per process.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With xnack on, GPU vm fault handler decide the best restore location,
then migrate range to the best restore location and update GPU mapping
to recover the GPU vm fault.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Register vram memory as MEMORY_DEVICE_PRIVATE type resource, to
allocate vram backing pages for page migration.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
XNACK mode controls the SQ RETRY_DISABLE setting that determines,
whether recoverable page faults can be supported on GFXv9 hardware.
Only on Aldebaran we can support different processes running with
different XNACK modes. On older chips all processes must use the same
RETRY_DISABLE setting. However, processes not relying on recoverable
page faults can work with RETRY enabled. This means XNACK off is always
available as a fallback so we can use the same mode on all GPUs in a
process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
HMM interval notifier callback notify CPU page table will be updated,
stop process queues if the updated address belongs to svm range
registered in process svms objects tree. Scheduled restore work to
update GPU page table using new pages address in the updated svm range.
The restore worker flushes any deferred work to make sure it restores
an up-to-date svm_range_list.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When application explicitly call unmap or unmap from mmput when
application exit, driver will receive MMU_NOTIFY_UNMAP event to remove
svm range from process svms object tree and list first, unmap from GPUs
(in the following patch).
Split the svm ranges to handle partial unmapping of svm ranges. To
avoid deadlocks, updating MMU notifiers, range lists and interval trees
is done in a deferred worker. New child ranges are attached to their
parent range's child_list until the worker can update the
svm_range_list. svm_range_set_attr flushes deferred work and takes the
mmap_write_lock to guarantee that it has an up-to-date svm_range_list.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
svm range structure stores the range start address, size, attributes,
flags, prefetch location and gpu bitmap which indicates which GPU this
range maps to. Same virtual address is shared by CPU and GPUs.
Process has svm range list which uses both interval tree and list to
store all svm ranges registered by the process. Interval tree is used by
GPU vm fault handler and CPU page fault handler to get svm range
structure from the specific address. List is used to scan all ranges in
eviction restore work.
No overlap range interval [start, last] exist in svms object interval
tree. If process registers new range which has overlap with old range,
the old range split into 2 ranges depending on the overlap happens at
head or tail part of old range.
Apply attributes preferred location, prefetch location, mapping flags,
migration granularity to svm range, store mapping gpu index into bitmap.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add svm (shared virtual memory) ioctl data structure and API definition.
The svm ioctl API is designed to be extensible in the future. All
operations are provided by a single IOCTL to preserve ioctl number
space. The arguments structure ends with a variable size array of
attributes that can be used to set or get one or multiple attributes.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
svm range uses gpu bitmap to store which GPU svm range maps to.
Application pass driver gpu id to specify GPU, the helper is needed to
convert gpu id to gpu bitmap idx.
Access through kfd_process_device pointers array from kfd_process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap
to access the BO through the corresponding file descriptor. The VM can
also be extracted from drm_priv, so drm_priv can replace the vm parameter
in the kfd2kgd interface.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Amdgpu driver uses 4-byte data type as DQM fence memory,
and transmits GPU address of fence memory to microcode
through query status PM4 message. However, query status
PM4 message definition and microcode processing are all
processed according to 8 bytes. Fence memory only allocates
4 bytes of memory, but microcode does write 8 bytes of memory,
so there is a memory corruption.
Changes since v1:
* Change dqm->fence_addr as a u64 pointer to fix this issue,
also fix up query_status and amdkfd_fence_wait_timeout function
uses 64 bit fence value to make them consistent.
Signed-off-by: Qu Huang <jinsdb@126.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove per_device_list from kfd_process and replace it with a
kfd_process_device pointers array of MAX_GPU_INSTANCES size. This helps
to manage the kfd_process_devices binded to a specific kfd_process.
Also, functions used by kfd_chardev to iterate over the list were
removed, since they are not valid anymore. Instead, it was replaced by a
local loop iterating the array.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This is to keep wavefront context for debug purpose
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Trap handler is set per-process per-device and is unrelated
to queue management.
Move implementation closer to TMA setup code.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.11-2020-11-05:
amdgpu:
- Add initial support for Vangogh
- Add support for Green Sardine
- Add initial support for Dimgrey Cavefish
- Scatter/Gather display support for Renoir
- Updates for Sienna Cichlid
- Updates for Navy Flounder
- SMU7 power improvements
- Modifier support for gfx9+
- CI BACO fixes
- Arcturus SMU fixes
- Lots of code cleanups
- DC fixes
- Kernel doc fixes
- Add more GPU HW client information to page fault error logging
- MPO clock tuning for RV
- FP fixes for DCN3 on ARM and PPC
radeon:
- Expose voltage via hwmon on Sumo APUs
amdkfd:
- Fix unique id handling
- Misc fixes
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105222749.201798-1-alexander.deucher@amd.com
|