path: root/drivers/gpu/drm/i915/gt/intel_ggtt.c
Age  Commit message (Author)
2022-01-18  drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2. (Maarten Lankhorst)
Because we will start to require the obj->resv lock for unbinding, ensure these vma eviction utility functions also take the lock. This requires some function signature changes, to ensure that the ww context is passed around, but is mostly straightforward. Previously this was split up into several patches, but reworking should allow for easier bisection. Changes since v1: - Handle evicting dead objects better. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-4-maarten.lankhorst@linux.intel.com
2022-01-18  Merge drm/drm-next into drm-intel-gt-next (Tvrtko Ursulin)
Maarten needs backmerge to account for header file renames/changes which landed via drm-intel-next and are interfering with his pinning work. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-11  drm/i915: Use vma resources for async unbinding (Thomas Hellström)
Implement async (non-blocking) unbinding by not syncing the vma before calling unbind on the vma_resource. Add the resulting unbind fence to the object's dma_resv from where it is picked up by the ttm migration code. Ideally these unbind fences should be coalesced with the migration blit fence to avoid stalling the migration blit waiting for unbind, as they can certainly go on in parallel, but since we don't yet have a reasonable data structure to use to coalesce fences and attach the resulting fence to a timeline, we defer that for now. Note that with async unbinding, even while the unbind waits for the preceding bind to complete before unbinding, the vma itself might have been destroyed in the process, clearing the vma pages. Therefore we can only allow async unbinding if we have a refcounted sg-list and keep a refcount on that for the vma resource pages to stay intact until binding occurs. If this condition is not met, a request for an async unbind is diverted to a sync unbind. v2: - Use a separate kmem_cache for vma resources for now to isolate their memory allocation and aid debugging. - Move the check for vm closed to the actual unbinding thread. Regardless of whether the vm is closed, we need the unbind fence to properly wait for capture. - Clear vma_res::vm on unbind and update its documentation. v4: - Take cache coloring into account when searching for vma resources pending unbind. (Matthew Auld) v5: - Fix timeout and error check in i915_vma_resource_bind_dep_await(). - Avoid taking a reference on the object for async binding if async unbind capable. - Fix braces around a single-line if statement. v6: - Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>) - Don't allow async unbinding if the vma_res pages are not the same as the object pages. (Matthew Auld) v7: - s/unsigned long/u64/ in a number of places (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
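The async-versus-sync decision described in this commit message can be pictured with a small userspace sketch. This is not the driver's code: `struct toy_vma_res`, its two flags, and `toy_pick_unbind()` are hypothetical names invented here, and the sketch only assumes the two conditions the message states (a refcounted sg-list, and vma_res pages identical to the object pages).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_unbind_mode { TOY_UNBIND_SYNC, TOY_UNBIND_ASYNC };

/* Hypothetical stand-in for the state the commit message talks about. */
struct toy_vma_res {
	bool sg_refcounted;      /* backing sg-list carries a refcount */
	bool pages_match_object; /* vma_res pages are the object's own pages */
};

/*
 * An async request is honoured only when both conditions hold;
 * otherwise it is diverted to a sync unbind.
 */
static enum toy_unbind_mode toy_pick_unbind(const struct toy_vma_res *res,
					    bool want_async)
{
	assert(res != NULL);
	if (want_async && res->sg_refcounted && res->pages_match_object)
		return TOY_UNBIND_ASYNC;
	return TOY_UNBIND_SYNC;
}
```

The point of the sketch is only the fallback: a caller may always ask for async, and the check quietly downgrades the request when the pages could disappear underneath the pending unbind.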
2022-01-11  drm/i915: Use the vma resource as argument for gtt binding / unbinding (Thomas Hellström)
When introducing asynchronous unbinding, the vma itself may no longer be alive when the actual binding or unbinding takes place. Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource instead of a struct i915_vma for the bind_vma() and unbind_vma() ops. Similarly change the insert_entries() op for struct i915_address_space. Replace a couple of i915_vma_snapshot members with their newly introduced i915_vma_resource counterparts, since they have the same lifetime. Also make sure to avoid changing the struct i915_vma_flags (in particular the bind flags) async. That should now only be done sync under the vm mutex. v2: - Update the vma_res::bound_flags when binding to the aliased ggtt v6: - Remove I915_VMA_ALLOC_BIT (Matthew Auld) - Change some members of struct i915_vma_resource from unsigned long to u64 (Matthew Auld) v7: - Fix vma resource size parameters to be u64 rather than unsigned long (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
2022-01-05  drm/i915/gt: Use to_gt() helper for GGTT accesses (Michał Winiarski)
GGTT is currently available both through i915->ggtt and gt->ggtt, and we eventually want to get rid of the i915->ggtt one. Use to_gt() for all i915->ggtt accesses to help with the future refactoring. During the probe of i915 the early initialization of the gt (intel_gt_init_hw_early()) is moved prior to any access to the ggtt. This is because that is the moment we assign the ggtt to the gt, and we want to do that before using it. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211221195946.3180-1-andi.shyti@linux.intel.com
2021-12-24  Merge tag 'drm-intel-gt-next-2021-12-23' of git://anongit.freedesktop.org/drm/drm-intel into drm-next (Dave Airlie)
Driver Changes:
- Added bits of DG2 support around page table handling (Stuart Summers, Matthew Auld)
- Fixed wakeref leak in PMU busyness during reset in GuC mode (Umesh Nerlige Ramappa)
- Fixed debugfs access crash if GuC failed to load (John Harrison)
- Bring back GuC error log to error capture, undoing accidental earlier breakage (Thomas Hellström)
- Fixed memory leak in error capture caused by earlier refactoring (Thomas Hellström)
- Exclude reserved stolen from driver use (Chris Wilson)
- Add memory region sanity checking and optional full test (Chris Wilson)
- Fixed buffer size truncation in TTM shmemfs backend (Robert Beckett)
- Use correct lock and don't overwrite internal data structures when stealing GuC context ids (Matthew Brost)
- Don't hog IRQs when destroying GuC contexts (John Harrison)
- Make GuC to Host communication more robust (Matthew Brost)
- Continuation of locking refactoring around VMA and backing store handling (Maarten Lankhorst)
- Improve performance of reading GuC log from debugfs (John Harrison)
- Log when GuC fails to reset an engine (John Harrison)
- Speed up GuC/HuC firmware loading by requesting RP0 (Vinay Belgaumkar)
- Further work on asynchronous VMA unbinding (Thomas Hellström, Christian König)
- Refactor GuC/HuC firmware handling to prepare for future platforms (John Harrison)
- Prepare for future different GuC/HuC firmware signing key sizes (Daniele Ceraolo Spurio, Michal Wajdeczko)
- Add noreclaim annotations (Matthew Auld)
- Remove racey GEM_BUG_ON between GPU reset and GuC communication handling (Matthew Brost)
- Refactor i915->gt with to_gt(i915) to prepare for future platforms (Michał Winiarski, Andi Shyti)
- Increase GuC log size for CONFIG_DEBUG_GEM (John Harrison)
- Fixed engine busyness in selftests when in GuC mode (Umesh Nerlige Ramappa)
- Make engine parking work with PREEMPT_RT (Sebastian Andrzej Siewior)
- Replace X86_FEATURE_PAT with pat_enabled() (Lucas De Marchi)
- Selftest for stealing of guc ids (Matthew Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YcRvKO5cyPvIxVCi@tursulin-mobl2
2021-12-20  drm/i915: Remove pages_mutex and intel_gtt->vma_ops.set/clear_pages members, v3. (Maarten Lankhorst)
Big delta, but boils down to moving set_pages to i915_vma.c, and removing the special handling; all callers use the defaults anyway. We only remap in ggtt, so default case will fall through. Because we still don't require locking in i915_vma_unpin(), handle this by using xchg in get_pages(), as it's locked with obj->mutex, and cmpxchg in unpin, which only fails if we race against a new pin. Changes since v1: - aliasing gtt sets ZERO_SIZE_PTR, not -ENODEV, remove special case from __i915_vma_get_pages(). (Matt) Changes since v2: - Free correct old pages in __i915_vma_get_pages(). (Matt) Remove race of clearing vma->pages accidentally from put, free it but leave it set, as only get has the lock. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-4-maarten.lankhorst@linux.intel.com Reviewed-by: Matthew Auld <matthew.auld@intel.com>
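The xchg/cmpxchg pairing mentioned above can be illustrated with C11 atomics in userspace. This is a sketch under assumptions, not the i915 implementation: `struct toy_vma`, `toy_get_pages()` and `toy_unpin_pages()` are hypothetical names, and the kernel's own xchg()/cmpxchg() primitives are replaced here by `atomic_exchange()` and `atomic_compare_exchange_strong()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma: only the pages pointer matters here. */
struct toy_vma {
	_Atomic(void *) pages;
};

/*
 * Locked path (get): unconditionally install the new pages and hand the
 * old pointer back so the caller can free it. Plays the role of xchg.
 */
static void *toy_get_pages(struct toy_vma *vma, void *new_pages)
{
	assert(new_pages != NULL);
	return atomic_exchange(&vma->pages, new_pages);
}

/*
 * Unlocked path (unpin): clear the pointer only if it is still the one
 * the caller saw. Returns false when a concurrent pin has installed
 * different pages in the meantime. Plays the role of cmpxchg.
 */
static bool toy_unpin_pages(struct toy_vma *vma, void *seen)
{
	void *expected = seen;

	return atomic_compare_exchange_strong(&vma->pages, &expected, NULL);
}
```

The asymmetry is the design point: the side that holds the lock may overwrite blindly, while the lockless side must be prepared to lose the race and leave the newer value in place.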
2021-12-17  drm/i915/gt: Use to_gt() helper (Michał Winiarski)
Use to_gt() helper consistently throughout the codebase. Pure mechanical s/i915->gt/to_gt(i915). No functional changes. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-5-andi.shyti@linux.intel.com
2021-12-10  Merge tag 'drm-intel-gt-next-2021-12-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-next (Dave Airlie)
Core Changes:
- Fix PENDING_ERROR leak in dma_fence_array_signaled() (Thomas Hellström)
Driver Changes:
- Fix runtime PM handling during PXP suspend (Tejas Upadhyay)
- Improve eviction performance on discrete by implementing async TTM moves (Thomas Hellström, Maarten Lankhorst)
- Improve robustness of error capture under memory pressure (Thomas Hellström)
- Fix GuC PMU versus GPU reset handling (Umesh Nerlige Ramappa)
- Use per device iommu check (Tvrtko Ursulin)
- Make error capture work with async migration (Thomas Hellström)
- Revert incorrect implementation of Wa_1508744258 causing hangs (José Roberto de Souza)
- Disable coarse power gating on some DG2 steppings workaround (Matt Roper)
- Add IC cache invalidation workaround on DG2 (Ramalingam C)
- Move two Icelake workarounds to the right place (Raviteja Goud Talla)
- Fix error pointer dereference in i915_gem_do_execbuffer() (Dan Carpenter)
- Fixup a couple of generic and DG2 specific issues in migration code (Matthew Auld)
- Fix kernel-doc warnings in i915_drm_object.c (Randy Dunlap)
- Drop stealing of bits from i915_sw_fence function pointer (Matthew Brost)
- Introduce new macros for i915 PTE (Michael Cheng)
- Prep work for engine reset by reset domain lookup (Tejas Upadhyay)
- Fixup drm-intel-gt-next build failure (Matthew Auld)
- Fix live_engine_busy_stats selftests in GuC mode (Umesh Nerlige Ramappa)
- Remove dma_resv_prune (Maarten Lankhorst)
- Preserve huge pages enablement after driver reload (Matthew Auld)
- Fix a NULL pointer dereference in igt_request_rewind() (selftests) (Zhou Qingyang)
- Add workaround numbers to GEN7_COMMON_SLICE_CHICKEN1 whitelisting (José Roberto de Souza)
- Increase timeouts in i915_gem_contexts selftests to handle GuC being slower (Bruce Chang)
Signed-off-by: Dave Airlie <airlied@redhat.com> # Conflicts: # drivers/gpu/drm/i915/display/intel_fbc.c From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YbIBOeqhn+nPzaYD@tursulin-mobl2
2021-12-09  drm/i915/gtt/xehpsdv: move scratch page to system memory (Matthew Auld)
On some platforms the hw has dropped support for 4K GTT pages when dealing with LMEM, and due to the design of 64K GTT pages in the hw, we can only mark the *entire* page-table as operating in 64K GTT mode, since the enable bit is still on the pde, and not the pte. And since we still need to allow 4K GTT pages for SMEM objects, we can't have a "normal" 4K page-table with scratch pointing to LMEM, since that's undefined from the hw pov. The simplest solution is to just move the 64K scratch page to SMEM on such platforms and call it a day, since that should work for all configurations. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211208141613.7251-4-ramalingam.c@intel.com
2021-12-06  drm/i915: Introduce new macros for i915 PTE (Michael Cheng)
Certain functions within i915 use macros that are defined for specific architectures by the mmu, such as _PAGE_RW and _PAGE_PRESENT (some architectures, like ARM64, don't even have these macros defined). Instead of re-using bits defined for the CPU, we should use bits defined for i915. This patch introduces two new 64 bit macros, GEN8_PAGE_PRESENT and GEN8_PAGE_RW, for bits 0 and 1 respectively, to replace all occurrences of _PAGE_RW and _PAGE_PRESENT within i915. v2(Michael Cheng): Use GEN8_ instead of I915_ Signed-off-by: Michael Cheng <michael.cheng@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> [ Move defines together with other GEN8 defines ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211206215245.513677-2-michael.cheng@intel.com
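To make the idea concrete, here is a minimal userspace sketch of device-defined PTE bits. The macro names and bit positions (present = bit 0, read/write = bit 1) come from the commit message; the definitions via `UINT64_C`, the 4 KiB address mask, and the helpers `make_pte()`, `pte_is_present()` and `pte_is_writable()` are assumptions made for illustration and are not the driver's PTE encoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device-defined PTE bits, independent of the CPU's _PAGE_* macros. */
#define GEN8_PAGE_PRESENT (UINT64_C(1) << 0)
#define GEN8_PAGE_RW      (UINT64_C(1) << 1)

/* Hypothetical helper: combine a 4 KiB-aligned address with flag bits. */
static uint64_t make_pte(uint64_t addr, bool writable)
{
	uint64_t pte;

	/* In this sketch the low 12 bits are reserved for flags. */
	assert((addr & UINT64_C(0xfff)) == 0);
	pte = addr | GEN8_PAGE_PRESENT;
	if (writable)
		pte |= GEN8_PAGE_RW;
	return pte;
}

static bool pte_is_present(uint64_t pte)
{
	return (pte & GEN8_PAGE_PRESENT) != 0;
}

static bool pte_is_writable(uint64_t pte)
{
	return (pte & GEN8_PAGE_RW) != 0;
}
```

Because the bits are spelled out as 64-bit constants rather than borrowed from the CPU's page-table headers, a sketch like this builds the same way on any architecture, which is the portability concern the commit raises.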
2021-12-01  drm/i915: Use per device iommu check (Tvrtko Ursulin)
With both integrated and discrete Intel GPUs in a system, the current global check of intel_iommu_gfx_mapped, as done from intel_vtd_active() may not be completely accurate. In this patch we add i915 parameter to intel_vtd_active() in order to prepare it for multiple GPUs and we also change the check away from Intel specific intel_iommu_gfx_mapped (global exported by the Intel IOMMU driver) to probing the presence of IOMMU on a specific device using device_iommu_mapped(). This will return true both for IOMMU pass-through and address translation modes which matches the current behaviour. If in the future we wanted to distinguish between these two modes we could either use iommu_get_domain_for_dev() and check for __IOMMU_DOMAIN_PAGING bit indicating address translation, or ask for a new API to be exported from the IOMMU core code. v2: * Check for dmar translation specifically, not just iommu domain. (Baolu) v3: * Go back to plain "any domain" check for now, rewrite commit message. v4: * Use device_iommu_mapped. (Robin, Baolu) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Robin Murphy <robin.murphy@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211126141424.493753-1-tvrtko.ursulin@linux.intel.com
2021-11-15  agp/intel-gtt: reduce intel-gtt dependencies more (Jani Nikula)
Don't include stuff on behalf of users if they're not strictly necessary for the header. Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7bcaa1684587b9b008d3c41468fb40e63c54fbc7.1636977089.git.jani.nikula@intel.com
2021-11-15  drm/i915: include intel-gtt.h only where needed (Jani Nikula)
Only intel_gt.c and intel_ggtt.c need the interface. Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/034f57db24d6936ac2e4e6830261d791240cdd79.1636977089.git.jani.nikula@intel.com
2021-11-09  drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages (Imre Deak)
So far the remapped view size in GTT/DPT was padded to the next aligned offset unnecessarily after the last color plane with an unaligned size. Remove the unnecessary padding. Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Fixes: 3d1adc3d64cf ("drm/i915/adlp: Add support for remapping CCS FBs") Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-3-imre.deak@intel.com (cherry picked from commit 6b6636e17649d75b4d0cc55d3dff9e44511a442a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-11-03  drm/i915: Factor out i915_ggtt_suspend_vm/i915_ggtt_resume_vm() (Imre Deak)
Factor out functions that are needed by the next patch to suspend/resume the memory mappings for DPT FBs. No functional change, except reordering during suspend the ggtt->invalidate(ggtt) call wrt. atomic_set(&ggtt->vm.open, open) and mutex_unlock(&ggtt->vm.mutex). This shouldn't matter due to the i915 suspend sequence being single threaded. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211101183551.3580546-1-imre.deak@intel.com
2021-11-02  drm/i915/adlp/fb: Fix remapping of linear CCS AUX surfaces (Imre Deak)
During remapping CCS FBs the CCS AUX surface mapped size and offset->x,y coordinate calculations assumed a tiled layout. This works as long as the CCS surface height is aligned to 64 lines (ensuring a 4k bytes CCS surface tile layout). However this alignment is not required by the HW (and the driver doesn't enforce it either). Add the remapping logic required to remap the pages of CCS surfaces without the above alignment, assuming the natural linear layout of the CCS surface (vs. tiled main surface layout). Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: 3d1adc3d64cf ("drm/i915/adlp: Add support for remapping CCS FBs") Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-5-imre.deak@intel.com
2021-11-02  drm/i915/fb: Factor out functions to remap contiguous FB obj pages (Imre Deak)
Factor out functions needed to map contiguous FB obj pages to a GTT/DPT VMA view in the next patch. While at it s/4096/I915_GTT_PAGE_SIZE/ in add_padding_pages(). No functional changes. v2: s/4096/I915_GTT_PAGE_SIZE/ (Matthew) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-4-imre.deak@intel.com
2021-11-02  drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages (Imre Deak)
So far the remapped view size in GTT/DPT was padded to the next aligned offset unnecessarily after the last color plane with an unaligned size. Remove the unnecessary padding. Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Fixes: 3d1adc3d64cf ("drm/i915/adlp: Add support for remapping CCS FBs") Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-3-imre.deak@intel.com
2021-10-11  Merge tag 'drm-intel-gt-next-2021-10-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-next (Dave Airlie)
UAPI Changes:
- Add uAPI for using PXP protected objects
  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064
- Add PCI IDs and LMEM discovery/placement uAPI for DG1
  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S
Cross-subsystem Changes:
- Merges 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"
Core Changes:
- Update ttm_move_memcpy for async use (Thomas)
Driver Changes:
- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele, Sean, Anshuman)
  See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)
- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)
- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-10-01  drm/i915: Use fixed offset for PTEs location (Michal Wajdeczko)
We assumed that for all modern GENs the PTEs and register space are split in the GTTMMADR BAR, but while it is true, we should rather use fixed offset as it is defined in the specification. Bspec: 4409, 4457, 4604, 11181, 9027, 13246, 13321, 44980 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: CQ Tang <cq.tang@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210926201005.1450-1-michal.wajdeczko@intel.com
2021-09-24  drm/i915: Reduce the number of objects subject to memcpy recover (Thomas Hellström)
We really only need memcpy restore for objects that affect the operability of the migrate context. That is, primarily the page-table objects of the migrate VM. Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early restores using memcpy and a way to assign LMEM page-table object flags to be used by the vms. Restore objects without this flag with the gpu blitter and only objects carrying the flag using TTM memcpy. Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and defer for a later audit which vms actually need it. Most importantly, user- allocated vms with pinned page-table objects can be restored using the blitter. Performance-wise memcpy restore is probably as fast as gpu restore if not faster, but using gpu restore will help tackling future restrictions in mappable LMEM size. v4: - Don't mark the aliasing ppgtt page table flags for early resume, but rather the ggtt page table flags as intended. (Matthew Auld) - The check for user buffer objects during early resume is pointless, since they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld) v5: - Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored before we fire up the migrate context. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
2021-09-23  drm/i915/adlp: Add support for remapping CCS FBs (Imre Deak)
Add support for remapping CCS FBs on ADL-P to remove the restriction of the power-of-two sized stride and the 2MB surface offset alignment for these FBs. We can only remap the tiles on the main surface, not the tiles on the CCS surface, so userspace has to generate the CCS surface aligning to the POT size padded main surface stride (by programming the AUX pagetable accordingly). For the required AUX pagetable setup, this requires that either the main surface stride is 8 tiles or that the stride is 16 tiles aligned (= 64 kbytes, the area mapped by one AUX PTE). v2: - Init intel_remapped_info::plane_alignment only for remapped views and do this from intel_fb_view_init(). Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210906182715.3915100-6-imre.deak@intel.com
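The stride rule stated in this commit message (the main surface stride is either exactly 8 tiles or a multiple of 16 tiles) can be written down as a one-line predicate. This is a sketch of the rule as worded above, not the driver's validation code; `ccs_main_stride_ok()` and `TILES_PER_AUX_PTE` are hypothetical names, and the stride is assumed to be given in tiles.

```c
#include <assert.h>
#include <stdbool.h>

/* From the commit text: 16 tiles = 64 kbytes, the area mapped by one AUX PTE. */
#define TILES_PER_AUX_PTE 16u

/*
 * Hypothetical check of the stated requirement: the stride, in tiles,
 * must be exactly 8 or aligned to 16 tiles.
 */
static bool ccs_main_stride_ok(unsigned int stride_tiles)
{
	assert(stride_tiles > 0);
	return stride_tiles == 8 || stride_tiles % TILES_PER_AUX_PTE == 0;
}
```

Under this reading, strides of 8, 16, 32 or 48 tiles pass, while 12 or 24 tiles do not.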
2021-09-06  drm/i915: Stop rcu support for i915_address_space (Daniel Vetter)
The full audit is quite a bit of work:
- i915_dpt has very simple lifetime (somehow we create a display pagetable vm per object, so it's _very_ simple, there's only ever a single vma in there), and uses i915_vm_close(), which internally does a i915_vm_put(). No rcu. Aside: wtf is i915_dpt doing in the intel_display.c garbage collector as a new feature, instead of added as a separate file with some clean-ish interface. Also, i915_dpt unfortunately re-introduces some coding patterns from pre-dma_resv_lock conversion times.
- i915_gem_proto_ctx is fully refcounted and no rcu, all protected by fpriv->proto_context_lock.
- i915_gem_context is itself rcu protected, and that might leak to anything it points at. Before commit cf977e18610e66e48c31619e7e0cfa871be9eada Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Dec 2 11:21:40 2020 +0000 drm/i915/gem: Spring clean debugfs and commit db80a1294c231b6ac725085f046bb2931e00c9db Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 18 11:08:54 2021 +0000 drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects we had a bunch of debugfs files that relied on rcu protecting everything, but those are gone now. The main one was removed even earlier with There doesn't seem to be anything left that's actually protecting stuff now that the ctx->vm itself is invariant. See commit ccbc1b97948ab671335e950271e39766729736c3 Author: Jason Ekstrand <jason@jlekstrand.net> Date: Thu Jul 8 10:48:30 2021 -0500 drm/i915/gem: Don't allow changing the VM on running contexts (v4) Note that we drop the vm refcount before the final release of the gem context refcount, so this is all very dangerous even without rcu. Note that aside from later on creating new engines (a defunct feature) and debug output we've never looked at gem_ctx->vm for anything functional, hence why this is ok. Fingers crossed. Preceding patches removed all vestiges of rcu use from gem_ctx->vm dereferencing to make it clear it's really not used.
The gem_ctx->rcu protection was introduced in commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Oct 4 14:40:09 2019 +0100 drm/i915: Move context management under GEM The commit message is somewhat entertaining because it fails to mention this fact completely, and compensates for that with an in-commit changelog entry that claims that ctx->vm is protected by ctx->mutex. Which was the case _before_ this commit, but no longer after it.
- intel_context holds a full reference. Unfortunately intel_context is also rcu protected and the reference to the ->vm is dropped before the rcu barrier - only the kfree is delayed. So again we need to check whether that leaks anywhere on the intel_context->vm. RCU is only used to protect intel_context sitting on the breadcrumb lists, which don't look at the vm anywhere, so we are fine. Nothing else relies on rcu protection of intel_context and hence is fully protected by the kref refcount alone, which protects intel_context->vm in turn. The breadcrumbs rcu usage was added in commit c744d50363b714783bbc88d986cc16def13710f7 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Nov 26 14:04:06 2020 +0000 drm/i915/gt: Split the breadcrumb spinlock between global and contexts its parent commit added the intel_context rcu protection: commit 14d1eaf08845c534963c83f754afe0cb14cb2512 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Nov 26 14:04:05 2020 +0000 drm/i915/gt: Protect context lifetime with RCU giving some credence to my claim that I've actually caught them all.
- drm_i915_gem_object's shares_resv_from pointer has a full refcount to the dma_resv, which is a sub-refcount that's released after the final i915_vm_put() has been called. Safe. Aside: Maybe we should have a struct dma_resv_shared which is just dma_resv + kref as a stand-alone thing. It's a pretty useful pattern which other drivers might want to copy.
For a bit more context see commit 4d8151ae5329cf50781a02fd2298a909589a5bab Author: Thomas Hellström <thomas.hellstrom@linux.intel.com> Date: Tue Jun 1 09:46:41 2021 +0200 drm/i915: Don't free shared locks while shared
- the fpriv->vm_xa was relying on rcu_read_lock for lookup, but that was updated in a prep patch too to just be a spinlock-protected lookup.
- intel_gt->vm is set at driver load in intel_gt_init() and released in intel_gt_driver_release(). There seems to be some issue that in some error paths this is called twice, but otherwise no rcu to be found anywhere. This was added in the below commit, which unfortunately doesn't explain why this complication exists. commit e6ba76480299a0d77c51d846f7467b1673aad25b Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Dec 21 16:03:24 2019 +0000 drm/i915: Remove i915->kernel_context The proper fix most likely for this is to start using drmm_ at large scale, but that's also huge amounts of work.
- i915_vma->vm is some real pain, because the vma is rcu protected, at least in the vma lookup in the context lookup cache in eb_lookup_vma(). This was added in commit 4ff4b44cbb70c269259958cbcc48d7b8a2cb9ec8 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jun 16 15:05:16 2017 +0100 drm/i915: Store a direct lookup from object handle to vma This was changed from the hashtable to a radix tree, but with the locking unchanged, in commit d1b48c1e7184d9bc4ae6d7f9fe2eed9efed11ffc Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Aug 16 09:52:08 2017 +0100 drm/i915: Replace execbuf vma ht with an idr In commit 93159e12353c2a47e5576d642845a91fa00530bf Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Mar 23 09:28:41 2020 +0000 drm/i915/gem: Avoid gem_context->mutex for simple vma lookup the locking was changed from dev->struct_mutex to rcu, which added the requirement to rcu protect i915_vma. Somehow this was missed in review (or I'm completely blind).
Irrespective of all that the vma lookup cache rcu_read_lock grabs a full reference of the vma and the rcu doesn't leak further. So no impact on i915_address_space from that. I have not found any other rcu use for i915_vma, but given that it seems broken I also didn't bother to do a careful in-depth audit. Altogether there's nothing left in-tree anymore which requires that a pointer deref to an i915_address_space is safe under rcu_read_lock only. rcu protection of i915_address_space was introduced in commit b32fa811156328aea5a3c2ff05cc096490382456 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Jun 20 19:37:05 2019 +0100 drm/i915/gtt: Defer address space cleanup to an RCU worker by mixing up a bugfix (i915_address_space needs to be released from a worker) with enabling rcu support. The commit message also seems somewhat confused, because it talks about cleanup of WC pages requiring sleep, while the code and linked bugzilla are about a requirement to take dev->struct_mutex (which yes sleeps but it's a much more specific problem). Since final kref_put can be called from pretty much anywhere (including hardirq context through the scheduler's i915_active cleanup) we need a worker here. Hence that part must be kept. Ideally all these reclaim workers should have some kind of integration with our shrinkers, but for some of these it's rather tricky. Anyway, that's a preexisting condition in the codebase that we won't fix in this patch here. 
We also remove the rcu_barrier in ggtt_cleanup_hw added in commit 60a4233a4952729089e4df152e730f8f4d0e82ce Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jul 29 14:24:12 2019 +0100 drm/i915: Flush the i915_vm_release before ggtt shutdown Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-11-daniel.vetter@ffwll.ch
2021-07-29drm/i915/gt: remove GRAPHICS_VER == 10Lucas De Marchi
Replace all remaining handling of GRAPHICS_VER {==,>=} 10 with {==,>=} 11. With the removal of CNL, there is no platform with graphics version equals 10. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-5-lucas.demarchi@intel.com
2021-07-16drm/i915: Remove allow_alloc from i915_gem_object_get_sg*Jason Ekstrand
This reverts the rest of 0edbb9ba1bfe ("drm/i915: Move cmd parser pinning to execbuffer"). Now that the only user of i915_gem_object_get_sg without allow_alloc has been removed, we can drop the parameter. This portion of the revert was broken into its own patch to aid review. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-4-jason@jlekstrand.net
2021-06-05drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VERLucas De Marchi
This was done by the following semantic patch:

    @@
    expression i915;
    @@
    - INTEL_GEN(i915)
    + GRAPHICS_VER(i915)

    @@
    expression i915;
    expression E;
    @@
    - INTEL_GEN(i915) >= E
    + GRAPHICS_VER(i915) >= E

    @@
    expression dev_priv;
    expression E;
    @@
    - !IS_GEN(dev_priv, E)
    + GRAPHICS_VER(dev_priv) != E

    @@
    expression dev_priv;
    expression E;
    @@
    - IS_GEN(dev_priv, E)
    + GRAPHICS_VER(dev_priv) == E

    @@
    expression dev_priv;
    expression from, until;
    @@
    - IS_GEN_RANGE(dev_priv, from, until)
    + IS_GRAPHICS_VER(dev_priv, from, until)

    @def@
    expression E;
    identifier id =~ "^gen$";
    @@
    - id = GRAPHICS_VER(E)
    + ver = GRAPHICS_VER(E)

    @@
    identifier def.id;
    @@
    - id
    + ver

It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
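To make the effect of the rename concrete, here is a minimal user-space sketch of how call sites read after the conversion. It is an illustration only: `struct fake_i915`, the simplified macro bodies, and the helper names `is_ver12()` and `is_ver6_or_7()` are assumptions made for this example and are not the kernel's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private device struct. */
struct fake_i915 {
	int graphics_ver;
};

/* Simplified stand-ins for the macros named in the commit message. */
#define GRAPHICS_VER(i915) ((i915)->graphics_ver)
#define IS_GRAPHICS_VER(i915, from, until) \
	(GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))

/* Before the conversion this check would have been IS_GEN(i915, 12). */
static bool is_ver12(const struct fake_i915 *i915)
{
	return GRAPHICS_VER(i915) == 12;
}

/* Before the conversion this check would have been IS_GEN_RANGE(i915, 6, 7). */
static bool is_ver6_or_7(const struct fake_i915 *i915)
{
	return IS_GRAPHICS_VER(i915, 6, 7);
}
```

The point of the conversion is that version checks become ordinary integer comparisons on GRAPHICS_VER(), so "==", "!=" and ">=" all read the same way.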
2021-06-02Merge drm/drm-next into drm-intel-gt-nextJoonas Lahtinen
Pulling in -rc2 fixes and TTM changes that next upcoming patches depend on. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-06-02Merge tag 'drm-intel-gt-next-2021-05-28' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - Add reworked uAPI for DG1 behind CONFIG_BROKEN (Matt A, Abdiel) Driver Changes: - Fix for Gitlab issues #3293 and #3450: Avoid kernel crash on older L-shape memory machines - Add Wa_14010733141 (VDBox SFC reset) for Gen11+ (Aditya) - Fix crash in auto_retire active retire callback due to misalignment (Stephane) - Fix overlay active retire callback alignment (Tvrtko) - Eliminate need to align active retire callbacks (Matt A, Ville, Daniel) - Program FF_MODE2 tuning value for all Gen12 platforms (Caz) - Add Wa_14011060649 for TGL,RKL,DG1 and ADLS (Swathi) - Create stolen memory region from local memory on DG1 (CQ) - Place PD in LMEM on dGFX (Matt A) - Use WC when default state object is allocated in LMEM (Venkata) - Determine the coherent map type based on object location (Venkata) - Use lmem physical addresses for fb_mmap() on discrete (Mohammed) - Bypass aperture on fbdev when LMEM is available (Anusha) - Return error value when displayable BO not in LMEM for dGFX (Mohammed) - Do release kernel context if breadcrumb measure fails (Janusz) - Hide modparams for compiled-out features (Tvrtko) - Apply Wa_22010271021 for all Gen11 platforms (Caz) - Fix unlikely ref count race in arming the watchdog timer (Tvrtko) - Check actual RC6 enable status in PMU (Tvrtko) - Fix a double free in gen8_preallocate_top_level_pdp (Lv) - Use trylock in shrinker for GGTT on BSW VT-d and BXT (Maarten) - Remove erroneous i915_is_ggtt check for I915_GEM_OBJECT_UNBIND_VM_TRYLOCK (Maarten) - Convert uAPI headers to real kerneldoc (Matt A) - Clean up kerneldoc warnings headers (Matt A, Maarten) - Fail driver if LMEM training failed (Matt R) - Avoid div-by-zero on Gen2 (Ville) - Read C0DRB3/C1DRB3 as 16 bits again and add _BW suffix (Ville) - Remove reference to struct drm_device.pdev (Thomas) - Increase separation between GuC and execlists code (Chris, Matt B) - Use might_alloc() (Bernard) - Split DGFX_FEATURES from 
GEN12_FEATURES (Lucas) - Deduplicate Wa_22010271021 programming on (Jose) - Drop duplicate WaDisable4x2SubspanOptimization:hsw (Tvrtko) - Selftest improvements (Chris, Hsin-Yi, Tvrtko) - Shuffle around init_memory_region for stolen (Matt) - Typo fixes (wengjianfeng) [airlied: fix conflict with fixes in i915_active.c] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YLCbBR22BsQ/dpJB@jlahtine-mobl.ger.corp.intel.com
2021-06-01drm/i915: Don't free shared locks while sharedThomas Hellström
We are currently sharing the VM reservation locks across a number of gem objects with page-table memory. Since TTM will individualize the reservation locks when freeing objects, including accessing the shared locks, make sure that the shared locks are not freed until that is done. For PPGTT we add an additional refcount, for GGTT we take additional measures to make sure objects sharing the GGTT reservation lock are freed at GGTT takedown. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210601074654.3103-3-thomas.hellstrom@linux.intel.com
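The "additional refcount" idea for a shared lock can be sketched in user-space C as follows. This is a hedged illustration of the pattern only, not the driver's code: `struct shared_resv` and its three helpers are hypothetical names, C11 atomics stand in for the kernel's kref, and the lock itself is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for "reservation lock + refcount". */
struct shared_resv {
	atomic_uint refcount;
	/* the shared lock itself would live here */
};

/* The creator (the address space, in this sketch) holds the first reference. */
static struct shared_resv *shared_resv_create(void)
{
	struct shared_resv *r = malloc(sizeof(*r));

	if (r)
		atomic_init(&r->refcount, 1);
	return r;
}

/* Each object that shares the lock takes its own reference. */
static struct shared_resv *shared_resv_get(struct shared_resv *r)
{
	assert(r != NULL);
	atomic_fetch_add(&r->refcount, 1);
	return r;
}

/* Returns true only when this call dropped the last reference and freed it. */
static bool shared_resv_put(struct shared_resv *r)
{
	if (atomic_fetch_sub(&r->refcount, 1) != 1)
		return false;
	free(r);
	return true;
}
```

With this shape, the address space can drop its reference first and the lock still outlives any object that has yet to be freed, which is the ordering problem the commit message describes.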
2021-05-07drm/i915/xelpd: First stab at DPT supportVille Syrjälä
Add support for DPT (display page table). DPT is a slightly peculiar two level page table scheme used for tiled scanout buffers (linear uses direct ggtt mapping still). The plane surface address will point at a page in the DPT which holds the PTEs for 512 actual pages. Thus we require 1/512 of the ggtt address space compared to a direct ggtt mapping. We create a new DPT address space for each framebuffer and track two vmas (one for the DPT, another for the ggtt). TODO: - Is the i915_address_space approach sane? - Maybe don't map the whole DPT to write the PTEs? - Deal with remapping/rotation? Need to create a separate DPT for each remapped/rotated plane I guess. Or else we'd need to make the per-fb DPT large enough to support potentially several remapped/rotated vmas. How large should that be? Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Wilson Chris P <Chris.P.Wilson@intel.com> Cc: Tang CQ <cq.tang@intel.com> Cc: Auld Matthew <matthew.auld@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Wilson Chris P <Chris.P.Wilson@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210506161930.309688-5-imre.deak@intel.com
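The 1/512 figure follows from simple arithmetic, sketched below. The 4 KiB page size and 8-byte PTE size are assumptions chosen so that one DPT page holds 512 PTEs as the message states, and `dpt_ggtt_pages()` is a hypothetical helper name used only for this illustration.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096u /* assumed 4 KiB pages */
#define PTE_SIZE_BYTES    8u    /* assumed 64-bit PTEs */
#define PTES_PER_DPT_PAGE (PAGE_SIZE_BYTES / PTE_SIZE_BYTES) /* 512 */

/*
 * Number of ggtt pages needed to map the DPT that covers a framebuffer
 * of fb_pages pages: one DPT page per 512 framebuffer pages, rounded up.
 */
static uint64_t dpt_ggtt_pages(uint64_t fb_pages)
{
	return (fb_pages + PTES_PER_DPT_PAGE - 1) / PTES_PER_DPT_PAGE;
}
```

Under these assumptions a 32 MiB framebuffer (8192 pages) needs 16 ggtt pages for its DPT instead of 8192 for a direct mapping.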
2021-04-29drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2.Maarten Lankhorst
The stop_machine() lock may allocate memory, but is called inside vm->mutex, which is taken in the shrinker. This will cause a lockdep splat, as can be seen below: <4>[ 462.585762] ====================================================== <4>[ 462.585768] WARNING: possible circular locking dependency detected <4>[ 462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G U <4>[ 462.585779] ------------------------------------------------------ <4>[ 462.585783] i915_selftest/5540 is trying to acquire lock: <4>[ 462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30 <4>[ 462.585814] but task is already holding lock: <4>[ 462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915] <4>[ 462.586301] which lock already depends on the new lock. <4>[ 462.586305] the existing dependency chain (in reverse order) is: <4>[ 462.586309] -> #2 (&vm->mutex/1){+.+.}-{3:3}: <4>[ 462.586323] i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915] <4>[ 462.586719] i915_address_space_init+0x12d/0x130 [i915] <4>[ 462.587092] ppgtt_init+0x4e/0x80 [i915] <4>[ 462.587467] gen8_ppgtt_create+0x3e/0x5c0 [i915] <4>[ 462.587828] i915_ppgtt_create+0x28/0xf0 [i915] <4>[ 462.588203] intel_gt_init+0x123/0x370 [i915] <4>[ 462.588572] i915_gem_init+0x129/0x1f0 [i915] <4>[ 462.588971] i915_driver_probe+0x753/0xd80 [i915] <4>[ 462.589320] i915_pci_probe+0x43/0x1d0 [i915] <4>[ 462.589671] pci_device_probe+0x9e/0x110 <4>[ 462.589680] really_probe+0xea/0x410 <4>[ 462.589690] driver_probe_device+0xd9/0x140 <4>[ 462.589697] device_driver_attach+0x4a/0x50 <4>[ 462.589704] __driver_attach+0x83/0x140 <4>[ 462.589711] bus_for_each_dev+0x75/0xc0 <4>[ 462.589718] bus_add_driver+0x14b/0x1f0 <4>[ 462.589724] driver_register+0x66/0xb0 <4>[ 462.589731] i915_init+0x70/0x87 [i915] <4>[ 462.590053] do_one_initcall+0x56/0x2e0 <4>[ 462.590061] do_init_module+0x55/0x200 <4>[ 462.590068] load_module+0x2703/0x2990 <4>[ 462.590074] __do_sys_finit_module+0xad/0x110 <4>[ 
462.590080] do_syscall_64+0x33/0x80 <4>[ 462.590089] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 462.590096] -> #1 (fs_reclaim){+.+.}-{0:0}: <4>[ 462.590109] fs_reclaim_acquire+0x9f/0xd0 <4>[ 462.590118] kmem_cache_alloc_trace+0x3d/0x430 <4>[ 462.590126] intel_cpuc_prepare+0x3b/0x1b0 <4>[ 462.590133] cpuhp_invoke_callback+0x9e/0x890 <4>[ 462.590141] _cpu_up+0xa4/0x130 <4>[ 462.590147] cpu_up+0x82/0x90 <4>[ 462.590153] bringup_nonboot_cpus+0x4a/0x60 <4>[ 462.590159] smp_init+0x21/0x5c <4>[ 462.590167] kernel_init_freeable+0x8a/0x1b7 <4>[ 462.590175] kernel_init+0x5/0xff <4>[ 462.590181] ret_from_fork+0x22/0x30 <4>[ 462.590187] -> #0 (cpu_hotplug_lock){++++}-{0:0}: <4>[ 462.590199] __lock_acquire+0x1520/0x2590 <4>[ 462.590207] lock_acquire+0xd1/0x3d0 <4>[ 462.590213] cpus_read_lock+0x39/0xc0 <4>[ 462.590219] stop_machine+0x12/0x30 <4>[ 462.590226] bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915] <4>[ 462.590601] ggtt_bind_vma+0x5d/0x80 [i915] <4>[ 462.590970] i915_vma_bind+0xdc/0x1c0 [i915] <4>[ 462.591374] i915_vma_pin_ww+0x435/0xb40 [i915] <4>[ 462.591779] make_obj_busy+0xcb/0x330 [i915] <4>[ 462.592170] igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915] <4>[ 462.592562] __i915_subtests.cold.7+0x42/0x92 [i915] <4>[ 462.592995] __run_selftests.part.3+0x10d/0x172 [i915] <4>[ 462.593428] i915_live_selftests.cold.5+0x1f/0x47 [i915] <4>[ 462.593860] i915_pci_probe+0x93/0x1d0 [i915] <4>[ 462.594210] pci_device_probe+0x9e/0x110 <4>[ 462.594217] really_probe+0xea/0x410 <4>[ 462.594226] driver_probe_device+0xd9/0x140 <4>[ 462.594233] device_driver_attach+0x4a/0x50 <4>[ 462.594240] __driver_attach+0x83/0x140 <4>[ 462.594247] bus_for_each_dev+0x75/0xc0 <4>[ 462.594254] bus_add_driver+0x14b/0x1f0 <4>[ 462.594260] driver_register+0x66/0xb0 <4>[ 462.594267] i915_init+0x70/0x87 [i915] <4>[ 462.594586] do_one_initcall+0x56/0x2e0 <4>[ 462.594592] do_init_module+0x55/0x200 <4>[ 462.594599] load_module+0x2703/0x2990 <4>[ 462.594605] __do_sys_finit_module+0xad/0x110 <4>[ 462.594612] 
do_syscall_64+0x33/0x80 <4>[ 462.594618] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 462.594625] other info that might help us debug this: <4>[ 462.594629] Chain exists of: cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1 <4>[ 462.594645] Possible unsafe locking scenario: <4>[ 462.594648] CPU0 CPU1 <4>[ 462.594652] ---- ---- <4>[ 462.594655] lock(&vm->mutex/1); <4>[ 462.594664] lock(fs_reclaim); <4>[ 462.594671] lock(&vm->mutex/1); <4>[ 462.594679] lock(cpu_hotplug_lock); <4>[ 462.594686] *** DEADLOCK *** <4>[ 462.594690] 4 locks held by i915_selftest/5540: <4>[ 462.594696] #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50 <4>[ 462.594715] #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915] <4>[ 462.595118] #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915] <4>[ 462.595519] #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915] <4>[ 462.595934] stack backtrace: <4>[ 462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G U 5.12.0-rc5-CI-Trybot_7644+ #1 <4>[ 462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018 <4>[ 462.595952] Call Trace: <4>[ 462.595961] dump_stack+0x7f/0xad <4>[ 462.595974] check_noncircular+0x12e/0x150 <4>[ 462.595982] ? save_stack.isra.17+0x3f/0x70 <4>[ 462.595991] ? drm_mm_insert_node_in_range+0x34a/0x5b0 <4>[ 462.596000] ? i915_vma_pin_ww+0x9ec/0xb40 [i915] <4>[ 462.596410] __lock_acquire+0x1520/0x2590 <4>[ 462.596419] ? do_init_module+0x55/0x200 <4>[ 462.596429] lock_acquire+0xd1/0x3d0 <4>[ 462.596435] ? stop_machine+0x12/0x30 <4>[ 462.596445] ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915] <4>[ 462.596816] cpus_read_lock+0x39/0xc0 <4>[ 462.596824] ? 
stop_machine+0x12/0x30 <4>[ 462.596831] stop_machine+0x12/0x30 <4>[ 462.596839] bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915] <4>[ 462.597210] ggtt_bind_vma+0x5d/0x80 [i915] <4>[ 462.597580] i915_vma_bind+0xdc/0x1c0 [i915] <4>[ 462.597986] i915_vma_pin_ww+0x435/0xb40 [i915] <4>[ 462.598395] ? make_obj_busy+0xcb/0x330 [i915] <4>[ 462.598786] make_obj_busy+0xcb/0x330 [i915] <4>[ 462.599180] ? 0xffffffff81000000 <4>[ 462.599187] ? debug_mutex_unlock+0x50/0xa0 <4>[ 462.599198] igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915] <4>[ 462.599592] __i915_subtests.cold.7+0x42/0x92 [i915] <4>[ 462.600026] ? i915_perf_selftests+0x20/0x20 [i915] <4>[ 462.600422] ? __i915_nop_setup+0x10/0x10 [i915] <4>[ 462.600820] __run_selftests.part.3+0x10d/0x172 [i915] <4>[ 462.601253] i915_live_selftests.cold.5+0x1f/0x47 [i915] <4>[ 462.601686] i915_pci_probe+0x93/0x1d0 [i915] <4>[ 462.602037] ? _raw_spin_unlock_irqrestore+0x3d/0x60 <4>[ 462.602047] pci_device_probe+0x9e/0x110 <4>[ 462.602057] really_probe+0xea/0x410 <4>[ 462.602067] driver_probe_device+0xd9/0x140 <4>[ 462.602075] device_driver_attach+0x4a/0x50 <4>[ 462.602084] __driver_attach+0x83/0x140 <4>[ 462.602091] ? device_driver_attach+0x50/0x50 <4>[ 462.602099] ? device_driver_attach+0x50/0x50 <4>[ 462.602107] bus_for_each_dev+0x75/0xc0 <4>[ 462.602116] bus_add_driver+0x14b/0x1f0 <4>[ 462.602124] driver_register+0x66/0xb0 <4>[ 462.602133] i915_init+0x70/0x87 [i915] <4>[ 462.602453] ? 0xffffffffa0606000 <4>[ 462.602458] do_one_initcall+0x56/0x2e0 <4>[ 462.602466] ? kmem_cache_alloc_trace+0x374/0x430 <4>[ 462.602476] do_init_module+0x55/0x200 <4>[ 462.602484] load_module+0x2703/0x2990 <4>[ 462.602500] ? __do_sys_finit_module+0xad/0x110 <4>[ 462.602507] __do_sys_finit_module+0xad/0x110 <4>[ 462.602519] do_syscall_64+0x33/0x80 <4>[ 462.602527] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 462.602535] RIP: 0033:0x7fab69d8d89d Changes since v1: - Add lockdep annotations during init, to ensure that lockdep is primed. 
This also fixes a false positive when reading /proc/lockdep_stats during module reload. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2021-04-27drm/i915/gtt: map the PD up frontMatthew Auld
We need to generalise our accessor for the page directories and tables from using the simple kmap_atomic to support local memory, and this setup must be done on acquisition of the backing storage prior to entering fence execution contexts. Here we replace the kmap with the object mapping code that for simple single page shmemfs object will return a plain kmap, that is then kept for the lifetime of the page directory. Note that keeping the mapping around is a potential concern here, since while the vma is pinned the mapping remains there for the PDs underneath, or at least until the used_count reaches zero, at which point we can safely destroy the mapping. For 32b this will be even worse since the address space is more limited, but since this change mostly impacts full ppGTT platforms, the justification is that for modern platforms we shouldn't care too much about 32b. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com
2021-04-08Merge tag 'drm-intel-next-2021-04-01' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Features: - Add support for FBs requiring a power-of-two stride padding (Imre) Refactoring: - Disassociate display version from gen (Matt) - Refactor legacy DP and HDMI code to separate files (Ville) - Refactor FB plane code to a separate file (Imre) - Refactor VBT child device info parsing and usage (Jani) - Refactor KBL/TGL/ADL-S display and gt stepping schemes (Jani) Fixes: - DP Link-Training Tunable PHY Repeaters (LTTPR) fixes (Imre) - HDCP fixes (Anshuman) - DP 2.0 HDMI 2.1 PCON Fixed Rate Link (FRL) fixes (Ankit) - Set HDA link parameters in driver (Kai) - Fix enabled_planes bitmask (Ville) - Fix transposed arguments to skl_plane_wm_level() (Ville) - Stop adding planes to the commit needlessly (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87v996ml17.fsf@intel.com
2021-04-08Merge tag 'drm-intel-gt-next-2021-04-06' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Prepare for local/device memory support on DG1 by starting to use it for kernel internal allocations: context, ring and engine scratch (Matt A, CQ, Abdiel, Imre) - Sandybridge fix to avoid hard hang on ring resume (Chris) - Limit imported dma-buf size to int32 (Matt A) - Double check heartbeat timeout before resetting (Chris) - Use new tasklet API for execution list (Emil) - Fix SPDX checkpats warnings (Chris) - Fixes for various checkpatch warnings (Chris) - Selftest improvements (Chris) - Move the defer_request waiter active assertion to correct spot (Chris) - Make local-memory probing a GT operation (Matt, Tvrtko) - Protect against request freeing during cancellation on wedging (Chris) - Retire unexpected starting state error dumping (Chris) - Distinction of memory regions in debugging (Zbigniew) - Always flush the submission queue on checking for idle (Chris) - Consolidate 2big error check to helper (Matt) - Decrease number of subplatform bits (Tvrtko) - Remove unused internal request priority levels (Chris) - Document the unused internal header bits in buddy allocator (Matt) - Cleanup the region class/instance encoding (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YGxksaZGXHnFxlwg@jlahtine-mobl.ger.corp.intel.com
2021-03-29drm/i915: Add support for FBs requiring a POT stride alignmentImre Deak
An upcoming platform has a restriction that the FB stride must be power-of-two aligned. To support framebuffers that do not meet this alignment, add logic that pads the tile rows to the POT aligned size. The HW won't read the padding PTEs, so these don't have to point to an allocated address, or even have their valid flag set. So use a NULL PTE instead of, for instance, the scratch page, which is simple and keeps the SG table compact. v2: - Simplify plane_view_dst_stride(). (Ville) - Pass pitch_tiles as unsigned int. v3: - Drop unintentional s/plane_state->rotation/plane_config->rotation/ change. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210325214808.2071517-24-imre.deak@intel.com
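The padding computation can be sketched as below. This is an illustration under stated assumptions, not the driver's implementation: strides are counted in tiles, and `round_up_pot()` and `pad_tiles()` are hypothetical helper names.

```c
#include <assert.h>

/* Round v up to the next power of two; v must be in [1, 2^31]. */
static unsigned int round_up_pot(unsigned int v)
{
	unsigned int p = 1;

	assert(v > 0 && v <= 0x80000000u);
	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Number of padding entries appended to each tile row so that the
 * destination stride becomes a power of two. In the scheme described
 * above these entries would be NULL PTEs that the HW never reads.
 */
static unsigned int pad_tiles(unsigned int src_stride_tiles)
{
	return round_up_pot(src_stride_tiles) - src_stride_tiles;
}
```

A source stride of 5 tiles is padded by 3 entries to reach a destination stride of 8, while a stride that is already a power of two needs no padding.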
2021-03-29drm/i915: s/stride/src_stride/ in the intel_remapped_plane_info structImre Deak
An upcoming patch adds a new dst_stride field to the intel_remapped_plane_info struct, so for clarity rename the current stride field to src_stride. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210325214808.2071517-23-imre.deak@intel.com
2021-03-24drm/i915/gtt/dg1: add PTE_LM plumbing for GGTTMatthew Auld
For the PTEs we get an LM bit, to signal whether the page resides in SMEM or LMEM. Based on a patch from Michel Thierry. BSpec: 45015 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210203171231.551338-3-matthew.auld@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
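As a rough illustration of what "an LM bit" in a PTE means, the sketch below encodes a page address together with a flag that says whether the page lives in local memory. The bit positions and all `sketch_`/`SKETCH_` names are assumptions made for this example; consult the BSpec entry cited above for the real layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PTE_VALID (1ull << 0) /* assumed: bit 0 marks a valid entry */
#define SKETCH_PTE_LM    (1ull << 1) /* assumed position of the LM bit */

/* Build a PTE for a 4 KiB aligned address, tagging LMEM pages with the LM bit. */
static uint64_t sketch_ggtt_pte_encode(uint64_t addr, bool in_lmem)
{
	uint64_t pte;

	assert((addr & 0xfffull) == 0); /* low bits are reserved for flags */

	pte = addr | SKETCH_PTE_VALID;
	if (in_lmem)
		pte |= SKETCH_PTE_LM;
	return pte;
}
```

The same address therefore yields two different PTE values depending on whether the backing page is in SMEM or LMEM, which is the plumbing the commit adds for the GGTT.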
2021-03-24drm/i915/gt: Remove repeated words from commentsChris Wilson
Checkpatch spotted a few repeated words in the comment, genuine mistakes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-3-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-03-24drm/i915: Use a single page table lock for each gtt.Maarten Lankhorst
We may create page table objects on the fly, but we may need to wait with the ww lock held. Instead of waiting on a freed obj lock, ensure we have the same lock for each object to keep -EDEADLK working. This ensures that i915_vma_pin_ww can lock the page tables when required. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-41-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: Move cmd parser pinning to execbufferMaarten Lankhorst
We need to get rid of allocations in the cmd parser, because it needs to be called from a signaling context, first move all pinning to execbuf, where we already hold all locks. Allocate jump_whitelist in the execbuffer, and add annotations around intel_engine_cmd_parser(), to ensure we only call the command parser without allocating any memory, or taking any locks we're not supposed to. Because i915_gem_object_get_page() may also allocate memory, add a path to i915_gem_object_get_sg() that prevents memory allocations, and walk the sg list manually. It should be similarly fast. This has the added benefit of being able to catch all memory allocation errors before the point of no return, and return -ENOMEM safely to the execbuf submitter. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com
2021-03-11Merge drm/drm-next into drm-intel-nextJani Nikula
Sync up with upstream. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-02-21Merge tag 'drm-next-2021-02-19' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "A pretty normal tree, lots of refactoring across the board, ttm, i915, nouveau, and bunch of features in various drivers. docs: - lots of updated docs core: - require crtc to have unique primary plane - fourcc macro fix - PCI bar quirk for bar resizing - don't send hotplug on error - move vm code to legacy - nuke hose only used on old obsolete alpha dma-buf: - kernel doc updates - improved lock tracking dp/hdmi: - DP-HDMI2.1 protocol converter support ttm: - bo size handling cleanup - release a pinned bo warning - cleanup lru handler - avoid using pages with drm_prime_sg_to_page_addr_arrays cma-helper: - prime/mmap fixes bridge: - add DP support gma500: - remove gma3600 support i915: - try eDP fast/narrow link again with fallback - Intel eDP backlight control - replace display register read/write macros - refactor intel_display.c - display power improvements - HPD code cleanup - Rocketlake display fixes - Power/backlight/RPM fixes - DG1 display fix - IVB/BYT clear residuals security fix again - make i915 mitigations options via parameter - HSW GT1 GPU hangs fixes - DG1 workaround hang fixes - TGL DMAR hang avoidance - Lots of GT fixes - follow on fixes for residuals clear - gen7 per-engine-reset support - HDCP2.2 + HDCP1.4 GEN12 DP MST support - TGL clear color support - backlight refactoring - VRR/Adaptive sync enabling on DP/EDP for TGL+ - async flips for all ilk+ amdgpu: - rework IH ring handling (Vega/Navi) - rework HDP handling (Vega/Navi) - swSMU updates for renoir/vangogh - Sienna Cichild overdrive support - FP16 on DCE8-11 support - GPU reset on navy flounder/vangogh - SMU profile fixes for APU - SR-IOV fixes - Vangogh SMU fixes - fan speed control fixes amdkfd: - config handling fix - buffer free fix - recursive lock warnings fix nouveau: - Turing MMU fault recovery fixes - mDP connectors reporting fix - audio locking fixes - rework engines/instances code to support new scheme tegra: - VIC newer firmware support - 
display/gr2d fixes for older tegra - pm reference leak fix mediatek: - SOC MT8183 support - decouple sub driver + share mtk mutex driver radeon: - PCI resource fix for some platforms ingenic: - pm support - 8-bit delta RGB panels vmwgfx: - managed driver helpers vc4: - BCM2711 DSI1 support - converted to atomic helpers - enable 10/12 bpc outputs - gem prime mmap helpers - CEC fix omap: - use degamma table - CTM support - rework DSI support imx: - stack usage fixes - drm managed support - imx-tve clock provider leak fix - rcar-du: - default mode fixes - conversion to managed API hisilicon: - use simple encoder vkms: - writeback connector support d3: - BT2020 support" * tag 'drm-next-2021-02-19' of git://anongit.freedesktop.org/drm/drm: (1459 commits) drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2) drm/radeon: OLAND boards don't have VCE drm/amdkfd: Fix recursive lock warnings drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth() drm/amd/display: Fix potential integer overflow drm/amdgpu/display: remove hdcp_srm sysfs on device removal drm/amdgpu: fix CGTS_TCC_DISABLE register offset on gfx10.3 drm/i915/gt: Correct surface base address for renderclear drm/i915: Disallow plane x+w>stride on ilk+ with X-tiling drm/nouveau/top/ga100: initial support drm/nouveau/top: add ioctrl/nvjpg drm/nouveau/privring: rename from ibus drm/nouveau/nvkm: remove nvkm_subdev.index drm/nouveau/nvkm: determine subdev id/order from layout drm/nouveau/vic: switch to instanced constructor drm/nouveau/sw: switch to instanced constructor drm/nouveau/sec2: switch to instanced constructor drm/nouveau/sec: switch to instanced constructor drm/nouveau/pm: switch to instanced constructor drm/nouveau/nvenc: switch to instanced constructor ...
2021-02-02drm/i915/gt: Remove references to struct drm_device.pdevThomas Zimmermann
Using struct drm_device.pdev is deprecated. Convert i915 to struct drm_device.dev. No functional changes. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-3-tzimmermann@suse.de
2021-01-26drm/i915/gt: Always try to reserve GGTT address 0x0Chris Wilson
Since writing to address 0 is a very common mistake, let's try to avoid putting anything sensitive there. References: https://gitlab.freedesktop.org/drm/intel/-/issues/2989 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210125125033.23656-1-chris@chris-wilson.co.uk Cc: stable@vger.kernel.org (cherry picked from commit 56b429cc584c6ed8b895d8d8540959655db1ff73) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-12-09drm/i915: Sleep around performing iommu unmaps on TigerlakeChris Wilson
Tigerlake is plagued by spontaneous DMAR faults [reason 7, next page table ptr is invalid] which lead to GPU hangs. These faults occur when an iommu map is immediately reused. Adding further clflushes and barriers around either the GTT PTE or iommu PTE updates do not prevent the faults. So far the only effect has been from inducing a delay between reuse of the iommu on the GPU, and applying the delay at the iommu map allows for the smallest stable delay. Note that such a delay is hideous and clearly does not fix the root cause, and so should only be a bandaid until a complete solution is found. The delay was determined by running igt/gem_exec_fence/parallel in a loop for a few hours (unpatched MTBF is about 10s). We have also seen such DMAR fault [reason 7] errors on other platforms, notably gen9-gen11, but so far it has only been trivially and consistently reproduced on Tigerlake. v2: Leave a tell-tale to know when we apply the vt'd quirk, and as a reminder to remove it again. Hopefully. Testcase: igt/gem_exec_fence/parallel Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-2-chris@chris-wilson.co.uk
2020-11-13Merge tag 'drm-intel-gt-next-2020-11-12-1' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Cross-subsystem Changes: - DMA mapped scatterlist fixes in i915 to unblock merging of https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom) Driver Changes: - Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"): Mark initial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris) - Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce) - Delay execlist processing for Tigerlake to avoid hang (Chris) - Fix for Tigerlake RCS engine health check through heartbeat (Chris) - Fix for Tigerlake reserved MOCS entries (Ayaz, Chris) - Fix Media power gate sequence on Tigerlake (Rodrigo) - Enable eLLC caching of display buffers for SKL+ (Ville) - Support parsing of oversize batches on Gen9 (Matt, Chris) - Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris) - Flush engines before Tigerlake breadcrumbs (Chris) - Use the local HWSP offset during submission (Chris) - Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew) - Use the active reference on the vma while capturing to avoid use-after-free (Chris) - Fix MOCS PTE setting for gen9+ (Ville) - Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris) - Avoid NULL dereference from PT/PD stash allocation error (Matt) - Hold request reference for canceling an active context (Chris) - Avoid infinite loop on x86-32 when mapping a lot of objects (Chris) - Disallow WC mappings when processor doesn't support them (Chris) - Return correct error in i915_gem_object_copy_blt() error path (Dan) - Return correct error in intel_context_create_request() error path (Maarten) - Tune down GuC communication enabled/disabled messages to debug (Jani) - Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris) - Cancel outstanding work after disabling heartbeats on an engine (Chris) - 
Signal cancelled requests (Chris) - Retire cancelled requests on unload (Chris) - Scrub HW state on driver remove (Chris) - Undo forced context restores after trivial preemptions (Chris) - Handle PCI unbind in PMU code (Tvrtko) - Fix CPU hotplug with multiple GPUs in PMU code (Tvrtko) - Correctly set SFC capability for video engines (Venkata) - Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal) - Improve GuC warnings on loading failure (John) - Avoid ownership race in buffer pool by clearing age (Chris) - Use MMIO to read CSB in case of failure (Chris, Mika) - Show engine properties in engine state dump to indicate changes (Chris, Joonas) - Break up error capture compression loops with cond_resched() (Chris) - Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris) - Serialise debugfs i915_gem_objects with ctx->mutex (Chris) - Always test execution status on closing the context and close if not persistent (Chris) - Avoid mixing integer types during batch copies (Chris, Jared) - Skip over MI_NOOP when parsing to avoid overhead (Chris) - Hold onto an explicit ref to i915_vma_work.pinned (Chris) - Perform all asynchronous waits prior to marking payload start (Chris) - Pull phys pread/pwrite implementations to the backend (Matt) - Improve record of hung engines in error state (Tvrtko) - Allow backends to override pread implementation (Matt) - Reinforce LRC poisoning checks to confirm context survives execution (Chris) - Fix memory region max size calculation (Matt) - Fix order when adding blocks to memory region (Matt) - Eliminate unused intel_virtual_engine_get_sibling func (Chris) - Cleanup kasan warning for on-stack (unsigned long) casting (Chris) - Onion unwind for scratch page allocation failure (Chris) - Poison stolen pages before use (Chris) - Selftest improvements (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: 
https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
2020-10-06drm/i915: Fix DMA mapped scatterlist lookupTvrtko Ursulin
As the previous patch fixed the places where we walk the whole scatterlist for DMA addresses, this patch fixes the random lookup functionality. To achieve this we have to add a second lookup iterator and add an i915_gem_object_get_sg_dma helper, to be used analogously to the existing i915_gem_object_get_sg. Therefore two lookup caches are maintained per object and they are flushed at the same point for simplicity. (Strictly speaking the DMA cache should be flushed from i915_gem_gtt_finish_pages, but today this coincides with unsetting of the pages in general.) Partial VMA view is then fixed to use the new DMA lookup and properly query sg length. v2: * Checkpatch. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Tom Murphy <murphyt7@tcd.ie> Cc: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-2-tvrtko.ursulin@linux.intel.com
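To make the "two lookup caches per object" idea concrete, here is a simplified userspace sketch in C. It assumes a scatterlist whose entries can cover a different number of pages on the CPU side than on the DMA side (for example after an IOMMU merges segments), which is why a random lookup needs a separate cached position for each view. The types and functions (`toy_sg`, `toy_iter`, `toy_object`, `toy_lookup`, `toy_flush`) are hypothetical and far simpler than whatever the driver actually uses.

```c
#include <stddef.h>

/* Toy scatterlist entry: CPU-side and DMA-side page counts may differ. */
struct toy_sg {
    unsigned int pages;      /* pages covered on the CPU side */
    unsigned int dma_pages;  /* pages covered after DMA mapping; 0 = unused */
};

/* Cached position of the last lookup, one instance per view. */
struct toy_iter {
    size_t sg_idx;           /* entry found by the last lookup */
    unsigned int sg_start;   /* first page index covered by that entry */
};

struct toy_object {
    const struct toy_sg *sgl;
    size_t nents;
    struct toy_iter get_page;      /* cache for the CPU view */
    struct toy_iter get_dma_page;  /* cache for the DMA view */
};

static unsigned int toy_len(const struct toy_sg *sg, int dma)
{
    return dma ? sg->dma_pages : sg->pages;
}

/* Both caches are dropped together, as the commit message describes. */
static void toy_flush(struct toy_object *obj)
{
    obj->get_page = (struct toy_iter){ 0, 0 };
    obj->get_dma_page = (struct toy_iter){ 0, 0 };
}

/*
 * Find the entry covering page n in the chosen view. Returns the entry index
 * and stores the offset within it, or returns -1 if n is out of range.
 */
static long toy_lookup(struct toy_object *obj, unsigned int n, int dma,
                       unsigned int *offset)
{
    struct toy_iter *it = dma ? &obj->get_dma_page : &obj->get_page;
    size_t idx = it->sg_idx;
    unsigned int start = it->sg_start;

    if (n < start) {          /* target is behind the cache: restart */
        idx = 0;
        start = 0;
    }
    while (idx < obj->nents && toy_len(&obj->sgl[idx], dma) != 0) {
        unsigned int len = toy_len(&obj->sgl[idx], dma);

        if (n < start + len) {
            it->sg_idx = idx;     /* remember where we ended up */
            it->sg_start = start;
            *offset = n - start;
            return (long)idx;
        }
        start += len;
        idx++;
    }
    return -1;
}
```

With entries `{2, 3}, {1, 1}, {1, 0}`, page 2 resolves to entry 1 in the CPU view but to offset 2 of entry 0 in the DMA view. Sharing one cached position between the two views would therefore return wrong results.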
2020-10-05drm/i915: don't conflate is_dgfx with fake lmemLucas De Marchi
When using fake lmem for tests, we are overriding the setting in device info for dgfx devices. Current users of IS_DGFX() except one are correct. However, as we add support for DG1, we are going to use it in additional places to trigger dgfx-only code paths. In the future, if needed, we can use HAS_LMEM() instead of IS_DGFX() in the places where it makes sense to also cover fake lmem use. v2: update gen8_gmch_probe() to use HAS_LMEM(): we need to steal the mappable aperture later (which is fine since it doesn't exist on "DGFX"), and use it as a substitute for LMEMBAR. The !mappable aperture property is also useful since it exercises some other parts of the code too. (Matthew Auld) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201001063917.3133475-1-lucas.demarchi@intel.com
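The distinction can be sketched with two independent predicates. This is a toy model in C, not the driver's macros: the struct, its fields and the `toy_*` helpers are all hypothetical, and they only illustrate that enabling fake lmem should change the "has local memory" answer without changing the "is a discrete device" answer.

```c
#include <stdbool.h>

/* Hypothetical device description; field names are illustrative only. */
struct toy_device {
    bool is_dgfx;        /* true only for real discrete hardware */
    bool has_local_mem;  /* true for discrete hardware or a fake-lmem test setup */
};

/* Discrete-only code paths key off the hardware type. */
static bool toy_is_dgfx(const struct toy_device *dev)
{
    return dev->is_dgfx;
}

/* Paths that should also run under fake lmem key off the memory flag. */
static bool toy_has_lmem(const struct toy_device *dev)
{
    return dev->has_local_mem;
}

/* Enabling fake lmem for testing touches only the memory flag. */
static void toy_enable_fake_lmem(struct toy_device *dev)
{
    dev->has_local_mem = true;
}
```

After `toy_enable_fake_lmem()` on an integrated device, `toy_has_lmem()` returns true while `toy_is_dgfx()` stays false, so dgfx-only paths are not triggered by a test configuration.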
2020-09-09Merge tag 'drm-intel-gt-next-2020-09-07' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next (Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added) UAPI Changes: (- Potential implicit changes from WW locking refactoring) Cross-subsystem Changes: (- WW locking changes should align the i915 locking more with others) Driver Changes: - MAJOR: Apply WW locking across the driver (Maarten) - Reverts for 5 commits to make applying WW locking faster (Maarten) - Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris) - Add missing dma_fence_put() for error case of syncobj timeline (Chris) - Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten) - Pin engine before pinning all objects (Maarten) - Rework intel_context pinning to do everything outside of pin_mutex (Maarten) - Avoid tracking GEM context until registered (Cc: stable, Chris) - Provide a fastpath for waiting on vma bindings (Chris) - Fixes to preempt-to-busy mechanism (Chris) - Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris) - Switch to object allocations for page directories (Chris) - Hold context/request reference while breadcrumbs are active (Chris) - Make sure execbuffer always passes ww state to i915_vma_pin (Maarten) - Code refactoring to facilitate use of WW locking (Maarten) - Locking refactoring to use more granular locking (Maarten, Chris) - Support for multiple pinned timelines per engine (Chris) - Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris) - Make active tracking/vma page-directory stash work preallocated (Chris) - Avoid flushing submission tasklet too often (Chris) - Reduce context termination list iteration guard to RCU (Chris) - Reductions to locking contention (Chris) - Fixes for issues found by CI (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com> Link: 
https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com