Age | Commit message (Collapse) | Author |
|
Because we record the mapping under non HWS mode in the software,
we can query pasid through vmid using the stored mapping instead of
reading from ATC registers.
This also prepares for the defeatured ATC block in future ASICs.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This makes possible the vmid pasid mapping query through software.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Since KFD pasid starts from 0x8000 (32768 in decimal), it is better
perceived as a hex number. Meanwhile, change the pasid type from
unsigned int to uint16_t to be consistent throughout the code.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The GFX10 does not require the control stack to be right after mqd
buffer any more, so move it back to usersapce allocated CSWR buffer.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
kfd needs drm_device to call into drm_cgroup functions
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This is required to check against cgroup permissions.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Keep the same use of CHIP_IDs for navi12 in kfd
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add device info for both navi12 PF and VF
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently this function pointer is missing for GFX10. Considering it is
a void function since GFX9, fix it by checking the function pointer
before dereferencing it.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The packet manager is not needed for non HWS mode except Hawaii, so only
initialize it for Hawaii under non HWS mode. This will simplify debugging
under non HWS mode for all new asics, because it eliminates one variable
out of the equation in non HWS mode
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The dozens of printing messages are compressed into 2 lines.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
alloc_workqueue is not checked for errors and as a result,
a potential NULL dereference could occur.
v2 (Felix Kuehling):
* Fix compile error (kfifo_free instead of fifo_free)
* Return proper error code
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
These were deleted before, but somehow showed up again. Delete them again.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pull drm fixes from Dave Airlie:
"Fixes built up over the past 1.5 weeks or so, it's two weeks of
amdgpu, some core cleanups and some panfrost fixes. I also finally
figured out why my desktop was slow to do a bunch of stuff (someone
gave it an IPv6 address which can't reach anything!).
core:
- Some cleanups and fixes in the self-refresh helpers
- Some cleanups and fixes in the atomic helpers
amdgpu:
- Fix a 64 bit divide
- Prevent a memory leak in a failure case in dc
- Load proper gfx firmware on navi14 variants
- Add more navi12 and navi14 PCI ids
- Misc fixes for renoir
- Fix bandwidth issues with multiple displays on vega20
- Support for Dali
- Fix a possible oops with KFD on hawaii
- Fix for backlight level after resume on some APUs
- Other misc fixes
panfrost:
- Multiple panfrost fixes for regulator support and page fault
handling"
* tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm: (34 commits)
drm/amd/display: prevent memory leak
drm/amdgpu/gfx10: add support for wks firmware loading
drm/amdgpu/display: include slab.h in dcn21_resource.c
drm/amdgpu/display: fix 64 bit divide
drm/panfrost: Prevent race when handling page fault
drm/panfrost: Remove NULL checks for regulator
drm/panfrost: Fix regulator_get_optional() misuse
drm: Measure Self Refresh Entry/Exit times to avoid thrashing
drm: Fix kerneldoc and remove unused struct member in self_refresh helper
drm/atomic: Rename crtc_state->pageflip_flags to async_flip
drm/atomic: Reject FLIP_ASYNC unconditionally
drm/atomic: Take the atomic toys away from X
drm/amdgpu: flag navi12 and 14 as experimental for 5.4
drm/kms: Duct-tape for mode object lifetime checks
drm/amdgpu: add navi12 pci id
drm/amdgpu: add navi14 PCI ID for work station SKU
drm/amdkfd: Swap trap temporary registers in gfx10 trap handler
drm/amd/powerplay: implement sysfs for getting dpm clock
drm/amd/display: Restore backlight brightness after system resume
drm/amd/display: Implement voltage limitation for dali
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull hmm updates from Jason Gunthorpe:
"This is more cleanup and consolidation of the hmm APIs and the very
strongly related mmu_notifier interfaces. Many places across the tree
using these interfaces are touched in the process. Beyond that a
cleanup to the page walker API and a few memremap related changes
round out the series:
- General improvement of hmm_range_fault() and related APIs, more
documentation, bug fixes from testing, API simplification &
consolidation, and unused API removal
- Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE,
and make them internal kconfig selects
- Hoist a lot of code related to mmu notifier attachment out of
drivers by using a refcount get/put attachment idiom and remove the
convoluted mmu_notifier_unregister_no_release() and related APIs.
- General API improvement for the migrate_vma API and revision of its
only user in nouveau
- Annotate mmu_notifiers with lockdep and sleeping region debugging
Two series unrelated to HMM or mmu_notifiers came along due to
dependencies:
- Allow pagemap's memremap_pages family of APIs to work without
providing a struct device
- Make walk_page_range() and related use a constant structure for
function pointers"
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits)
libnvdimm: Enable unit test infrastructure compile checks
mm, notifier: Catch sleeping/blocking for !blockable
kernel.h: Add non_block_start/end()
drm/radeon: guard against calling an unpaired radeon_mn_unregister()
csky: add missing brackets in a macro for tlb.h
pagewalk: use lockdep_assert_held for locking validation
pagewalk: separate function pointers from iterator data
mm: split out a new pagewalk.h header from mm.h
mm/mmu_notifiers: annotate with might_sleep()
mm/mmu_notifiers: prime lockdep
mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports
mm/hmm: hmm_range_fault() infinite loop
mm/hmm: hmm_range_fault() NULL pointer bug
mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
mm/mmu_notifiers: remove unregister_no_release
RDMA/odp: remove ib_ucontext from ib_umem
RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
RDMA/mlx5: Use ib_umem_start instead of umem.address
...
|
|
Pull drm updates from Dave Airlie:
"This is the main pull request for 5.4-rc1 merge window. I don't think
there is anything outstanding so next week should just be fixes, but
we'll see if I missed anything. I landed some fixes earlier in the
week but got delayed writing summary and sending it out, due to a mix
of sick kid and jetlag!
There are some fixes pending, but I'd rather get the main merge out of
the way instead of delaying it longer.
It's also pretty large in commit count and new amd header file size.
The largest thing is four new amdgpu products (navi12/14, arcturus and
renoir APU support).
Otherwise it's pretty much lots of work across the board, i915 has
started landing tigerlake support, lots of icelake fixes and lots of
locking reworking for future gpu support, lots of header file rework
(drmP.h is nearly gone), some old legacy hacks (DRM_WAIT_ON) have been
put into the places they are needed.
uapi:
- content protection type property for HDCP
core:
- rework include dependencies
- lots of drmP.h removals
- link rate calculation robustness fix
- make fb helper map only when required
- add connector->DDC adapter link
- DRM_WAIT_ON removed
- drop DRM_AUTH usage from drivers
dma-buf:
- reservation object fence helper
dma-fence:
- shrink dma_fence struct
- merge signal functions
- store timestamps in dma_fence
- selftests
ttm:
- embed drm_get_object struct into ttm_buffer_object
- release_notify callback
bridges:
- sii902x - audio graph card support
- tc358767 - aux data handling rework
- ti-snd64dsi86 - debugfs support, DSI mode flags support
panels:
- Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech
COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191, Boe
Himax8279d, Sharp LD-D5116Z01B
- TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01,
Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1
i915:
- Initial tigerlake platform support
- Locking simplification work, general all over refactoring.
- Selftests
- HDCP debug info improvements
- DSI properties
- Icelake display PLL fixes, colorspace fixes, bandwidth fixes, DSI
suspend/resume
- GuC fixes
- Perf fixes
- ElkhartLake enablement
- DP MST fixes
- GVT - command parser enhancements
amdgpu:
- add wipe memory on release flag for buffer creation
- Navi12/14 support (may be marked experimental)
- Arcturus support
- Renoir APU support
- mclk DPM for Navi
- DC display fixes
- Raven scatter/gather support
- RAS support for GFX
- Navi12 + Arcturus power features
- GPU reset for Picasso
- smu11 i2c controller support
amdkfd:
- navi12/14 support
- Arcturus support
radeon:
- kexec fix
nouveau:
- improved display color management
- detect lack of GPU power cables
vmwgfx:
- evicition priority support
- remove unused security feature
msm:
- msm8998 display support
- better async commit support for cursor updates
etnaviv:
- per-process address space support
- performance counter fixes
- softpin support
mcde:
- DCS transfers fix
exynos:
- drmP.h cleanup
lima:
- reduce logging
kirin:
- misc clenaups
komeda:
- dual-link support
- DT memory regions
hisilicon:
- misc fixes
imx:
- IPUv3 image converter fixes
- 32-bit RGB V4L2 pixel format support
ingenic:
- more support for panel related cases
mgag200:
- cursor support fix
panfrost:
- export GPU features register to userspace
- gpu heap allocations
- per-fd address space support
pl111:
- CLD pads wiring support removed from DT
rockchip:
- rework to use DRM PSR helpers
- fix bug in VOP_WIN_GET macro
- DSI DT binding rework
sun4i:
- improve support for color encoding and range
- DDC enabled GPIO
tinydrm:
- rework SPI support
- improve MIPI-DBI support
- moved to drm/tiny
vkms:
- rework CRC tracking
dw-hdmi:
- get_eld and i2s improvements
gm12u320:
- misc fixes
meson:
- global code cleanup
- vpu feature detect
omap:
- alpha/pixel blend mode properties
rcar-du:
- misc fixes"
* tag 'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm: (2112 commits)
drm/nouveau/bar/gm20b: Avoid BAR1 teardown during init
drm/nouveau: Fix ordering between TTM and GEM release
drm/nouveau/prime: Extend DMA reservation object lock
drm/nouveau: Fix fallout from reservation object rework
drm/nouveau/kms/nv50-: Don't create MSTMs for eDP connectors
drm/i915: Use NOEVICT for first pass on attemping to pin a GGTT mmap
drm/i915: to make vgpu ppgtt notificaiton as atomic operation
drm/i915: Flush the existing fence before GGTT read/write
drm/i915: Hold irq-off for the entire fake lock period
drm/i915/gvt: update RING_START reg of vGPU when the context is submitted to i915
drm/i915/gvt: update vgpu workload head pointer correctly
drm/mcde: Fix DSI transfers
drm/msm: Use the correct dma_sync calls harder
drm/msm: remove unlikely() from WARN_ON() conditions
drm/msm/dsi: Fix return value check for clk_get_parent
drm/msm: add atomic traces
drm/msm/dpu: async commit support
drm/msm: async commit support
drm/msm: split power control from prepare/complete_commit
drm/msm: add kms->flush_commit()
...
|
|
ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
instead, aligning implementation with gfx9 trap handler.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: shaoyun liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
instead, aligning implementation with gfx9 trap handler.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: shaoyun liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch fixes null pointer issue below, I missed to init the asic renior name
while I rebase the patches.
[ 106.004250] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 106.004254] #PF: supervisor read access in kernel mode
[ 106.004256] #PF: error_code(0x0000) - not-present page
[ 106.004257] PGD 0 P4D 0
[ 106.004261] Oops: 0000 [#1] SMP NOPTI
[ 106.004264] CPU: 3 PID: 1422 Comm: modprobe Not tainted 5.2.0-rc1-custom #1
[ 106.004266] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS
WCD9814N_Weekly_19_08_1 08/14/2019
[ 106.004272] RIP: 0010:strncpy+0x12/0x30
[ 106.004274] Code: c1 c0 11 48 c1 c6 15 48 31 d0 48 c1 c2 20 31 c2 89 d0 31 f0
41 5c 5d c3 55 48 85 d2 48 89 f8 48 89 e5 74 1e 48 01 fa 48 89 f9 <44> 0f b6 06
41 80 f8 01 44 88 01 48 83 de ff 48 83 c1 01 48 39 d1
[ 106.004278] RSP: 0018:ffffc092c1fd37a8 EFLAGS: 00010286
[ 106.004281] RAX: ffff9e943466a28c RBX: 00000000000036ed RCX: ffff9e943466a28c
[ 106.004283] RDX: ffff9e943466a2ac RSI: 0000000000000000 RDI: ffff9e943466a28c
[ 106.004285] RBP: ffffc092c1fd37a8 R08: ffff9e943d100000 R09: 0000000000000228
[ 106.004287] R10: ffff9e94418dc5a8 R11: ffff9e944746c0d0 R12: 0000000000000000
[ 106.004289] R13: ffff9e943fa1ec00 R14: ffff9e943466a200 R15: ffff9e943466a200
[ 106.004291] FS: 00007f7a022c5540(0000) GS:ffff9e9447ac0000(0000)
knlGS:0000000000000000
[ 106.004294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 106.004296] CR2: 0000000000000000 CR3: 00000001ff0b0000 CR4: 0000000000340ee0
[ 106.004298] Call Trace:
[ 106.004382] kfd_topology_add_device+0x150/0x610 [amdgpu]
[ 106.004445] kgd2kfd_device_init+0x2e0/0x4f0 [amdgpu]
[ 106.004509] amdgpu_amdkfd_device_init+0x14c/0x1b0 [amdgpu]
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch adds renoir kfd topology which is the same with Raven.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Renoir use GFX v9, so adds v9 package manager.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Renoir is GFX v9, so init v9 kernel queue.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Renoir is GMC v9, so init v9 kfd apertures.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Renoir is the same with Raven, will enable iommu event in future.
v2: fix the checking (Thong)
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Renoir is GFX9, so enable v9 devcie queue manager.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch inits renoir kfd device info, so we treat renoir as "dgpu"
(bypass iommu v2). Will enable needs_iommu_device till renoir iommu is ready.
v2: rebase and align the drm-next
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Renoir's cache info should be the same with raven and carrizo's.
v2: fix missed "break"
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Initial support of Navi14 in KFD. The device IDs will be added later.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The issue was accidentally introduced recently.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This optimizes out the pci device id usage in KFD and makes the code
more maintainable.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add the virtual function PCI device id.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix sparse warning:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:1846:6:
warning: symbol 'deallocate_hiq_sdma_mqd' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c: In function restore_process_worker:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:949:29: warning:
variable pdd set but not used [-Wunused-but-set-variable]
It is not used since
commit 5b87245faf57 ("drm/amdkfd: Simplify kfd2kgd interface")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The name field in node topology has not been used. We re-purpose it to
hold the asic name, which can be queried by user space applications
through sysfs.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
From rdma.git
Jason Gunthorpe says:
====================
This is a collection of general cleanups for ODP to clarify some of the
flows around umem creation and use of the interval tree.
====================
The branch is based on v5.3-rc5 due to dependencies, and is being taken
into hmm.git due to dependencies in the next patches.
* odp_fixes:
RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
RDMA/mlx5: Use ib_umem_start instead of umem.address
RDMA/core: Make invalidate_range a device operation
RDMA/odp: Use kvcalloc for the dma_list and page_list
RDMA/odp: Check for overflow when computing the umem_odp end
RDMA/odp: Provide ib_umem_odp_release() to undo the allocs
RDMA/odp: Split creating a umem_odp from ib_umem_get
RDMA/odp: Make the three ways to create a umem_odp clear
RMDA/odp: Consolidate umem_odp initialization
RDMA/odp: Make it clearer when a umem is an implicit ODP umem
RDMA/odp: Iterate over the whole rbtree directly
RDMA/odp: Use the common interval tree library instead of generic
RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The sequence of mmu_notifier_unregister_no_release(),
mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
free_notifier callback.
As this is the last user of those APIs, converting it means we can drop
them.
Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
When using mmu_notifer_unregister_no_release() the caller must ensure
there is a SRCU synchronize before the mn memory is freed, otherwise use
after free races are possible, for instance:
CPU0 CPU1
invalidate_range_start
hlist_for_each_entry_rcu(..)
mmu_notifier_unregister_no_release(&p->mn)
kfree(mn)
if (mn->ops->invalidate_range_end)
The error unwind in amdkfd misses the SRCU synchronization.
amdkfd keeps the kfd_process around until the mm is released, so split the
flow to fully initialize the kfd_process and register it for find_process,
and with the notifier. Past this point the kfd_process does not need to be
cleaned up as it is fully ready.
The final failable step does a vm_mmap() and does not seem to impact the
kfd_process global state. Since it also cannot be undone (and already has
problems with undo if it internally fails), it has to be last.
This way we don't have to try to unwind the mmu_notifier_register() and
avoid the problem with the SRCU.
Along the way this also fixes various other error unwind bugs in the flow.
Fixes: 45102048f77e ("amdkfd: Add process queue manager module")
Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The amdgpu_task_info will be used when printing VM page fault for KFD
processes.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanatha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Linux 5.3-rc3
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit 1a058c3376765ee31d65e28cbbb9d4ff15120056.
This interface is still in too much flux. Revert until
it's sorted out.
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Following bitmap layout logic introduced by:
"drm/amdgpu: support get_cu_info for Arcturus".
v2: squash in fixup for gfx_v9_0.c (Alex)
v3: squash in debug print output fix
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
VCC moved out of user SGPR allocation in gfx10. It's now stored
in SGPRs 106-107.
Also fixes incorrect SGPR read offsets.
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
These moved from SGPRs in gfx9 to HWREG in gfx10.
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Copy/paste error, first 4 VGPRs are separated by 64 dwords (256 bytes).
Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Previously submitted code was taken from an incorrect branch and
was non-functional.
Cc: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-By: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If the trap is entered due to MODE.DEBUG_EN=1 and SAVECTX is raised
concurrently the handler cannot identify the source of the exception.
This causes the debugger to lose single step exception notification
when a context save request arrives at the same time.
When MODE.DEBUG_EN=1 and STATUS.HALT=0 (exception not already handled)
jump to the second-level trap handler upon entering the trap. The
second-level trap will set STATUS.HALT=1 and return to the shader.
If SAVECTX was raised then control flow will return to the trap, which
will then handle the context save request.
Cc: Tony Tye <tony.tye@amd.com>
Cc: Laurent Morichetti <laurent.morichetti@amd.com>
Cc: Qingchuan Shi <qingchuan.shi@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When a wavefront raises TRAPSTS.XNACK_ERROR with STATUS.ALLOW_REPLAY=0
subsequent memory instructions have undefined behavior. In practice
SQC stores continue to work but TCP stores do not.
Context save is permitted to fail after XNACK error because the
wavefront will be halted and subsequently terminated. However the
debugger has an interest in retrieving the wavefront VGPR/LDS state.
Detect the out-of-spec case and use SQC stores during context save
in place of TCP stores.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In preparation to enabling -Wimplicit-fallthrough, this patch silences
the following warning:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c: In function ‘mqd_manager_init_v10’:
./include/linux/dynamic_debug.h:122:52: warning: this statement may fall through [-Wimplicit-fallthrough=]
#define __dynamic_func_call(id, fmt, func, ...) do { \
^
./include/linux/dynamic_debug.h:143:2: note: in expansion of macro ‘__dynamic_func_call’
__dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~~~~
./include/linux/dynamic_debug.h:153:2: note: in expansion of macro ‘_dynamic_func_call’
_dynamic_func_call(fmt, __dynamic_pr_debug, \
^~~~~~~~~~~~~~~~~~
./include/linux/printk.h:336:2: note: in expansion of macro ‘dynamic_pr_debug’
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c:432:3: note: in expansion of macro ‘pr_debug’
pr_debug("%s@%i\n", __func__, __LINE__);
^~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c:433:2: note: here
case KFD_MQD_TYPE_COMPUTE:
^~~~
by removing the call to pr_debug() in KFD_MQD_TYPE_CP:
"The mqd init for CP and COMPUTE will have the same routine." [1]
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
[1] https://lore.kernel.org/lkml/c735a1cc-a545-50fb-44e7-c0ad93ee8ee7@amd.com/
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
|
|
Add missing break statement in order to prevent the code from falling
through to case CHIP_NAVI10.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: 14328aa58ce5 ("drm/amdkfd: Add navi10 support to amdkfd. (v3)")
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
|