diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2020-06-02 15:04:15 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2020-06-02 15:04:15 -0700 |
commit | faa392181a0bd42c5478175cef601adeecdc91b6 (patch) | |
tree | e020e1142e34786676d0cd40f539bccdbb66099e /drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | |
parent | cfa3b8068b09f25037146bfd5eed041b78878bee (diff) | |
parent | 9ca1f474cea0edc14a1d7ec933e5472c0ff115d3 (diff) |
Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights:
- Core DRM had a lot of refactoring around managed drm resources to
make drivers simpler.
- Intel Tigerlake support is on by default
- amdgpu now support p2p PCI buffer sharing and encrypted GPU memory
Details:
core:
- uapi: error out EBUSY when existing master
- uapi: rework SET/DROP MASTER permission handling
- remove drm_pci.h
- drm_pci* are now legacy
- introduced managed DRM resources
- subclassing support for drm_framebuffer
- simple encoder helper
- edid improvements
- vblank + writeback documentation improved
- drm/mm - optimise tree searches
- port drivers to use devm_drm_dev_alloc
dma-buf:
- add flag for p2p buffer support
mst:
- ACT timeout improvements
- remove drm_dp_mst_has_audio
- don't use 2nd TX slot - spec recommends against it
bridge:
- dw-hdmi various improvements
- chrontel ch7033 support
- fix stack issues with old gcc
hdmi:
- add unpack function for drm infoframe
fbdev:
- misc fbdev driver fixes
i915:
- uapi: global sseu pinning
- uapi: OA buffer polling
- uapi: remove generated perf code
- uapi: per-engine default property values in sysfs
- Tigerlake GEN12 enabled.
- Lots of gem refactoring
- Tigerlake enablement patches
- move to drm_device logging
- Icelake gamma HW readout
- push MST link retrain to hotplug work
- bandwidth atomic helpers
- ICL fixes
- RPS/GT refactoring
- Cherryview full-ppgtt support
- i915 locking guidelines documented
- require linear fb stride to be 512 multiple on gen9
- Tigerlake SAGV support
amdgpu:
- uapi: encrypted GPU memory handling
- uapi: add MEM_SYNC IB flag
- p2p dma-buf support
- export VRAM dma-bufs
- FRU chip access support
- RAS/SR-IOV updates
- Powerplay locking fixes
- VCN DPG (powergating) enablement
- GFX10 clockgating fixes
- DC fixes
- GPU reset fixes
- navi SDMA fix
- expose FP16 for modesetting
- DP 1.4 compliance fixes
- gfx10 soft recovery
- Improved Critical Thermal Faults handling
- resizable BAR on gmc10
amdkfd:
- uapi: GWS resource management
- track GPU memory per process
- report PCI domain in topology
radeon:
- safe reg list generator fixes
nouveau:
- HD audio fixes on recent systems
- vGPU detection (fail probe if we're on one, for now)
- Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it)
- SVM improvements/fixes
- NVIDIA format modifier support
- Misc other fixes.
adv7511:
- HDMI SPDIF support
ast:
- allocate crtc state size
- fix double assignment
- fix suspend
bochs:
- drop connector register
cirrus:
- move to tiny drivers.
exynos:
- fix imported dma-buf mapping
- enable runtime PM
- fixes and cleanups
mediatek:
- DPI pin mode swap
- config mipi_tx current/impedance
lima:
- devfreq + cooling device support
- task handling improvements
- runtime PM support
pl111:
- vexpress init improvements
- fix module auto-load
rcar-du:
- DT bindings conversion to YAML
- Planes zpos sanity check and fix
- MAINTAINERS entry for LVDS panel driver
mcde:
- fix return value
mgag200:
- use managed config init
stm:
- read endpoints from DT
vboxvideo:
- use PCI managed functions
- drop WC mtrr
vkms:
- enable cursor by default
rockchip:
- afbc support
virtio:
- various cleanups
qxl:
- fix cursor notify port
hisilicon:
- 128-byte stride alignment fix
sun4i:
- improved format handling"
* tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits)
drm/amd/display: Fix potential integer wraparound resulting in a hang
drm/amd/display: drop cursor position check in atomic test
drm/amdgpu: fix device attribute node create failed with multi gpu
drm/nouveau: use correct conflicting framebuffer API
drm/vblank: Fix -Wformat compile warnings on some arches
drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
drm/amd/display: Handle GPU reset for DC block
drm/amdgpu: add apu flags (v2)
drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven
drm/amdgpu: fix pm sysfs node handling (v2)
drm/amdgpu: move gpu_info parsing after common early init
drm/amdgpu: move discovery gfx config fetching
drm/nouveau/dispnv50: fix runtime pm imbalance on error
drm/nouveau: fix runtime pm imbalance on error
drm/nouveau: fix runtime pm imbalance on error
drm/nouveau/debugfs: fix runtime pm imbalance on error
drm/nouveau/nouveau/hmm: fix migrate zero page to GPU
drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations
drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST
drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes()
...
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 78 |
1 files changed, 56 insertions, 22 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index fc32586ef80b1..1d4128227ffd6 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -888,7 +888,8 @@ static int gfx_v8_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) gpu_addr = adev->wb.gpu_addr + (index * 4); adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); memset(&ib, 0, sizeof(ib)); - r = amdgpu_ib_get(adev, NULL, 16, &ib); + r = amdgpu_ib_get(adev, NULL, 16, + AMDGPU_IB_POOL_DIRECT, &ib); if (r) goto err1; @@ -1550,7 +1551,8 @@ static int gfx_v8_0_do_edc_gpr_workarounds(struct amdgpu_device *adev) /* allocate an indirect buffer to put the commands in */ memset(&ib, 0, sizeof(ib)); - r = amdgpu_ib_get(adev, NULL, total_size, &ib); + r = amdgpu_ib_get(adev, NULL, total_size, + AMDGPU_IB_POOL_DIRECT, &ib); if (r) { DRM_ERROR("amdgpu: failed to get ib (%d).\n", r); return r; @@ -1892,6 +1894,7 @@ static int gfx_v8_0_compute_ring_init(struct amdgpu_device *adev, int ring_id, int r; unsigned irq_type; struct amdgpu_ring *ring = &adev->gfx.compute_ring[ring_id]; + unsigned int hw_prio; ring = &adev->gfx.compute_ring[ring_id]; @@ -1911,9 +1914,11 @@ static int gfx_v8_0_compute_ring_init(struct amdgpu_device *adev, int ring_id, + ((ring->me - 1) * adev->gfx.mec.num_pipe_per_mec) + ring->pipe; + hw_prio = amdgpu_gfx_is_high_priority_compute_queue(adev, ring->queue) ? + AMDGPU_GFX_PIPE_PRIO_HIGH : AMDGPU_RING_PRIO_DEFAULT; /* type-2 packets are deprecated on MEC, use type-3 instead */ r = amdgpu_ring_init(adev, ring, 1024, - &adev->gfx.eop_irq, irq_type); + &adev->gfx.eop_irq, irq_type, hw_prio); if (r) return r; @@ -2017,7 +2022,8 @@ static int gfx_v8_0_sw_init(void *handle) } r = amdgpu_ring_init(adev, ring, 1024, &adev->gfx.eop_irq, - AMDGPU_CP_IRQ_GFX_ME0_PIPE0_EOP); + AMDGPU_CP_IRQ_GFX_ME0_PIPE0_EOP, + AMDGPU_RING_PRIO_DEFAULT); if (r) return r; } @@ -4120,7 +4126,6 @@ static int gfx_v8_0_rlc_resume(struct amdgpu_device *adev) static void gfx_v8_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable) { - int i; u32 tmp = RREG32(mmCP_ME_CNTL); if (enable) { @@ -4131,8 +4136,6 @@ static void gfx_v8_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable) tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, ME_HALT, 1); tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, PFP_HALT, 1); tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, CE_HALT, 1); - for (i = 0; i < adev->gfx.num_gfx_rings; i++) - adev->gfx.gfx_ring[i].sched.ready = false; } WREG32(mmCP_ME_CNTL, tmp); udelay(50); @@ -4320,14 +4323,10 @@ static int gfx_v8_0_cp_gfx_resume(struct amdgpu_device *adev) static void gfx_v8_0_cp_compute_enable(struct amdgpu_device *adev, bool enable) { - int i; - if (enable) { WREG32(mmCP_MEC_CNTL, 0); } else { WREG32(mmCP_MEC_CNTL, (CP_MEC_CNTL__MEC_ME1_HALT_MASK | CP_MEC_CNTL__MEC_ME2_HALT_MASK)); - for (i = 0; i < adev->gfx.num_compute_rings; i++) - adev->gfx.compute_ring[i].sched.ready = false; adev->gfx.kiq.ring.sched.ready = false; } udelay(50); @@ -4437,11 +4436,8 @@ static void gfx_v8_0_mqd_set_priority(struct amdgpu_ring *ring, struct vi_mqd *m if (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) { if (amdgpu_gfx_is_high_priority_compute_queue(adev, ring->queue)) { mqd->cp_hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_HIGH; - ring->has_high_prio = true; mqd->cp_hqd_queue_priority = AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM; - } else { - ring->has_high_prio = false; } } } @@ -5619,12 +5615,18 @@ static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid) { u32 data; - data = RREG32(mmRLC_SPM_VMID); + if (amdgpu_sriov_is_pp_one_vf(adev)) + data = RREG32_NO_KIQ(mmRLC_SPM_VMID); + else + data = RREG32(mmRLC_SPM_VMID); data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK; data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << RLC_SPM_VMID__RLC_SPM_VMID__SHIFT; - WREG32(mmRLC_SPM_VMID, data); + if (amdgpu_sriov_is_pp_one_vf(adev)) + WREG32_NO_KIQ(mmRLC_SPM_VMID, data); + else + WREG32(mmRLC_SPM_VMID, data); } static const struct amdgpu_rlc_funcs iceland_rlc_funcs = { @@ -6387,10 +6389,10 @@ static void gfx_v8_0_ring_emit_patch_cond_exec(struct amdgpu_ring *ring, unsigne ring->ring[offset] = (ring->ring_size >> 2) - offset + cur; } -static void gfx_v8_0_ring_emit_rreg(struct amdgpu_ring *ring, uint32_t reg) +static void gfx_v8_0_ring_emit_rreg(struct amdgpu_ring *ring, uint32_t reg, + uint32_t reg_val_offs) { struct amdgpu_device *adev = ring->adev; - struct amdgpu_kiq *kiq = &adev->gfx.kiq; amdgpu_ring_write(ring, PACKET3(PACKET3_COPY_DATA, 4)); amdgpu_ring_write(ring, 0 | /* src: register*/ @@ -6399,9 +6401,9 @@ static void gfx_v8_0_ring_emit_rreg(struct amdgpu_ring *ring, uint32_t reg) amdgpu_ring_write(ring, reg); amdgpu_ring_write(ring, 0); amdgpu_ring_write(ring, lower_32_bits(adev->wb.gpu_addr + - kiq->reg_val_offs * 4)); + reg_val_offs * 4)); amdgpu_ring_write(ring, upper_32_bits(adev->wb.gpu_addr + - kiq->reg_val_offs * 4)); + reg_val_offs * 4)); } static void gfx_v8_0_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t reg, @@ -6815,6 +6817,34 @@ static int gfx_v8_0_sq_irq(struct amdgpu_device *adev, return 0; } +static void gfx_v8_0_emit_mem_sync(struct amdgpu_ring *ring) +{ + amdgpu_ring_write(ring, PACKET3(PACKET3_SURFACE_SYNC, 3)); + amdgpu_ring_write(ring, PACKET3_TCL1_ACTION_ENA | + PACKET3_TC_ACTION_ENA | + PACKET3_SH_KCACHE_ACTION_ENA | + PACKET3_SH_ICACHE_ACTION_ENA | + PACKET3_TC_WB_ACTION_ENA); /* CP_COHER_CNTL */ + amdgpu_ring_write(ring, 0xffffffff); /* CP_COHER_SIZE */ + amdgpu_ring_write(ring, 0); /* CP_COHER_BASE */ + amdgpu_ring_write(ring, 0x0000000A); /* poll interval */ +} + +static void gfx_v8_0_emit_mem_sync_compute(struct amdgpu_ring *ring) +{ + amdgpu_ring_write(ring, PACKET3(PACKET3_ACQUIRE_MEM, 5)); + amdgpu_ring_write(ring, PACKET3_TCL1_ACTION_ENA | + PACKET3_TC_ACTION_ENA | + PACKET3_SH_KCACHE_ACTION_ENA | + PACKET3_SH_ICACHE_ACTION_ENA | + PACKET3_TC_WB_ACTION_ENA); /* CP_COHER_CNTL */ + amdgpu_ring_write(ring, 0xffffffff); /* CP_COHER_SIZE */ + amdgpu_ring_write(ring, 0xff); /* CP_COHER_SIZE_HI */ + amdgpu_ring_write(ring, 0); /* CP_COHER_BASE */ + amdgpu_ring_write(ring, 0); /* CP_COHER_BASE_HI */ + amdgpu_ring_write(ring, 0x0000000A); /* poll interval */ +} + static const struct amd_ip_funcs gfx_v8_0_ip_funcs = { .name = "gfx_v8_0", .early_init = gfx_v8_0_early_init, @@ -6861,7 +6891,8 @@ static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_gfx = { 3 + /* CNTX_CTRL */ 5 + /* HDP_INVL */ 12 + 12 + /* FENCE x2 */ - 2, /* SWITCH_BUFFER */ + 2 + /* SWITCH_BUFFER */ + 5, /* SURFACE_SYNC */ .emit_ib_size = 4, /* gfx_v8_0_ring_emit_ib_gfx */ .emit_ib = gfx_v8_0_ring_emit_ib_gfx, .emit_fence = gfx_v8_0_ring_emit_fence_gfx, @@ -6879,6 +6910,7 @@ static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_gfx = { .patch_cond_exec = gfx_v8_0_ring_emit_patch_cond_exec, .emit_wreg = gfx_v8_0_ring_emit_wreg, .soft_recovery = gfx_v8_0_ring_soft_recovery, + .emit_mem_sync = gfx_v8_0_emit_mem_sync, }; static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_compute = { @@ -6895,7 +6927,8 @@ static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_compute = { 5 + /* hdp_invalidate */ 7 + /* gfx_v8_0_ring_emit_pipeline_sync */ VI_FLUSH_GPU_TLB_NUM_WREG * 5 + 7 + /* gfx_v8_0_ring_emit_vm_flush */ - 7 + 7 + 7, /* gfx_v8_0_ring_emit_fence_compute x3 for user fence, vm fence */ + 7 + 7 + 7 + /* gfx_v8_0_ring_emit_fence_compute x3 for user fence, vm fence */ + 7, /* gfx_v8_0_emit_mem_sync_compute */ .emit_ib_size = 7, /* gfx_v8_0_ring_emit_ib_compute */ .emit_ib = gfx_v8_0_ring_emit_ib_compute, .emit_fence = gfx_v8_0_ring_emit_fence_compute, @@ -6908,6 +6941,7 @@ static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_compute = { .insert_nop = amdgpu_ring_insert_nop, .pad_ib = amdgpu_ring_generic_pad_ib, .emit_wreg = gfx_v8_0_ring_emit_wreg, + .emit_mem_sync = gfx_v8_0_emit_mem_sync_compute, }; static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_kiq = { |