summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-09-20drm/i915: Do not define vma on stackVenkata Sandeep Dhanalakota
Defining vma on stack can cause stack overflow, if vma gets populated with new fields. v2: (Daniel Vetter) - Add kerneldoc for new field Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210916162819.27848-2-matthew.brost@intel.com
2021-09-20drm/i915/gt: Add "intel_" as prefix in set_mocs_index()Ayaz A Siddiqui
Adding missing "intel_" prefix in set_mocs_index(). Fixes: b62aa57e3c78 ("drm/i915/gt: Add support of mocs propagation") Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210916062736.1733587-1-ayaz.siddiqui@intel.com
2021-09-20drm/i915: Make wa list per-gtVenkata Sandeep Dhanalakota
Support for multiple GT's within a single i915 device will be arriving soon. Since each GT may have its own fusing and require different workarounds, we need to make the GT workaround functions and multicast steering setup per-gt. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210917170845.836358-1-matthew.d.roper@intel.com
2021-09-20Merge remote-tracking branch 'tip/locking/wwmutex' into drm-intel-gt-nextJoonas Lahtinen
Needed by Maarten's series "drm/i915: Short-term pinning and async eviction". Link: https://lists.freedesktop.org/archives/intel-gfx/2021-September/277870.html Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-09-18drm/i915: deduplicate frequency dump on debugfsLucas De Marchi
Although commit 9dd4b065446a ("drm/i915/gt: Move pm debug files into a gt aware debugfs") says it was moving debug files to gt/, the i915_frequency_info file was left behind and its implementation copied into drivers/gpu/drm/i915/gt/debugfs_gt_pm.c. Over time we had several patches having to change both places to keep them in sync (and some patches failing to do so). The initial idea was to remove i915_frequency_info, but there are user space tools using it. From a quick code search there are other scripts and test tools besides igt, so it's not simply updating igt to get rid of the older file. Here we export a function using drm_printer as parameter and make both show() implementations to call this same function. Aside from a few variable name differences, for i915_frequency_info this brings a few lines that were not previously printed: RP UP EI, RP UP THRESHOLD, RP DOWN THRESHOLD and RP DOWN EI. These came in as part of commit 9c878557b1eb ("drm/i915/gt: Use the RPM config register to determine clk frequencies"), which didn't change both places. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-4-lucas.demarchi@intel.com
2021-09-18drm/i915: rename debugfs_gt_pm filesLucas De Marchi
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make functions, defines and structs follow suit. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-3-lucas.demarchi@intel.com
2021-09-18drm/i915: rename debugfs_engines filesLucas De Marchi
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_engines.[ch] to intel_gt_engines_debugfs.[ch] and then make functions, defines and structs follow suit. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-2-lucas.demarchi@intel.com
2021-09-18drm/i915: rename debugfs_gt filesLucas De Marchi
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_gt.[ch] to intel_gt_debugfs.[ch] and then make functions, defines and structs follow suit. While at it and since we are renaming the header, sort the includes alphabetically. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210918025754.1254705-1-lucas.demarchi@intel.com
2021-09-17kernel/locking: Add context to ww_mutex_trylock()Maarten Lankhorst
i915 will soon gain an eviction path that trylock a whole lot of locks for eviction, getting dmesg failures like below: BUG: MAX_LOCK_DEPTH too low! turning off the locking correctness validator. depth: 48 max: 48! 48 locks held by i915_selftest/5776: #0: ffff888101a79240 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x88/0x160 #1: ffffc900009778c0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_vma_pin.constprop.63+0x39/0x1b0 [i915] #2: ffff88800cf74de8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_vma_pin.constprop.63+0x5f/0x1b0 [i915] #3: ffff88810c7f9e38 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x1c4/0x9d0 [i915] #4: ffff88810bad5768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915] #5: ffff88810bad60e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915] ... #46: ffff88811964d768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915] #47: ffff88811964e0e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_evict_something+0x110/0x860 [i915] INFO: lockdep is turned off. Fixing eviction to nest into ww_class_acquire is a high priority, but it requires a rework of the entire driver, which can only be done one step at a time. As an intermediate solution, add an acquire context to ww_mutex_trylock, which allows us to do proper nesting annotations on the trylocks, making the above lockdep splat disappear. This is also useful in regulator_lock_nested, which may avoid dropping regulator_nesting_mutex in the uncontended path, so use it there. TTM may be another user for this, where we could lock a buffer in a fastpath with list locks held, without dropping all locks we hold. [peterz: rework actual ww_mutex_trylock() implementations] Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/YUBGPdDDjKlxAuXJ@hirez.programming.kicks-ass.net
2021-09-16drm/i915: Move __i915_gem_free_object to ttm_bo_destroyMaarten Lankhorst
When we implement delayed destroy, we may have a second call to the delete_mem_notify() handler, while free_object() only should be called once. Move it to bo->destroy(), to ensure it's only called once. This fixes some weird memory corruption issues with delayed destroy when async eviction is used. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210830121006.2978297-2-maarten.lankhorst@linux.intel.com Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2021-09-15drm/i915: Mark GPU wedging on driver unregister unrecoverableJanusz Krzysztofik
GPU wedged flag now set on driver unregister to prevent from further using the GPU can be then cleared unintentionally when calling __intel_gt_unset_wedged() still before the flag is finally marked unrecoverable. We need to have it marked unrecoverable earlier. Implement that by replacing a call to intel_gt_set_wedged() in intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini(). With the above in place, intel_gt_set_wedged_on_fini() is now called twice on driver remove, second time from __intel_gt_disable(). This seems harmless, while dropping intel_gt_set_wedged_on_fini() from __intel_gt_disable() proved to break some driver probe error unwind paths as well as mock selftest exit path. Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210903142837.216978-1-janusz.krzysztofik@linux.intel.com
2021-09-15drm/i915: Add mmap lock around vma_lookup() in the mman selftest.Maarten Lankhorst
Add mmap_read_lock/unlock around vma_lookup(). The core code requires this for lookups. Since we only check if the return value is NULL, we can immediately unlock. This fixes the following splat in the selftes: i915: Running i915_gem_mman_live_selftests/igt_mmap ------------[ cut here ]------------ WARNING: CPU: 3 PID: 5654 at include/linux/mmap_lock.h:164 find_vma+0x4e/0xb0 Modules linked in: i915(+) vgem fuse snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec snd_hwdep e1000e snd_hda_core ptp snd_pcm ttm mei_me pps_core i2c_i801 prime_numbers i2c_smbus mei [last unloaded: i915] CPU: 3 PID: 5654 Comm: i915_selftest Tainted: G U 5.15.0-rc1-CI-Trybot_7984+ #1 Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.00 10/31/2017 RIP: 0010:find_vma+0x4e/0xb0 Code: de 48 89 ef e8 d3 94 fe ff 48 85 c0 74 34 48 83 c4 08 5b 5d c3 48 8d bf 28 01 00 00 be ff ff ff ff e8 d6 46 8b 00 85 c0 75 c8 <0f> 0b 48 8b 85 b8 00 00 00 48 85 c0 75 c6 48 89 ef e8 12 26 87 00 RSP: 0018:ffffc900013df980 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00007f9df2b80000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff822e314c RDI: ffffffff8233c83f RBP: ffff88811bafc840 R08: ffff888107d0ddb8 R09: 00000000fffffffe R10: 0000000000000001 R11: 00000000ffbae7ba R12: 0000000000000000 R13: 0000000000000000 R14: ffff88812a710000 R15: ffff888114fa42c0 FS: 00007f9def9d4c00(0000) GS:ffff888266580000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f799627fe50 CR3: 000000011bbc2006 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __igt_mmap+0xe0/0x490 [i915] igt_mmap+0xd2/0x160 [i915] ? __trace_bprintk+0x6e/0x80 __i915_subtests.cold.7+0x42/0x92 [i915] ? i915_perf_selftests+0x20/0x20 [i915] ? __i915_nop_setup+0x10/0x10 [i915] __run_selftests.part.3+0x10d/0x172 [i915] i915_live_selftests.cold.5+0x1f/0x47 [i915] i915_pci_probe+0x93/0x1d0 [i915] Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/intel/issues/4129 Link: https://patchwork.freedesktop.org/patch/msgid/20210915105946.394412-1-maarten.lankhorst@linux.intel.com Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2021-09-15Merge drm/drm-next into drm-intel-gt-nextJoonas Lahtinen
Close the divergence which has caused patches not to apply and have a solid baseline for the PXP patches that Rodrigo will send a topic branch PR for. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-09-14drm/i915/dg2: Define MOCS table for DG2Matt Roper
Bspec: 45101, 45427 Cc: Ramalingam C <ramalingam.c@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210904003544.2422282-3-matthew.d.roper@intel.com
2021-09-14drm/i915/xehpsdv: Define MOCS table for XeHP SDVLucas De Marchi
Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to memory for L3 destined transaction" and L3_LKP to "enable Lookup for uncacheable accesses". Bspec: 45101 Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210904003544.2422282-2-matthew.d.roper@intel.com
2021-09-14drm/i915: Enable -Wsometimes-uninitializedNathan Chancellor
This warning helps catch uninitialized variables. It should have been enabled at the same time as commit b2423184ac33 ("drm/i915: Enable -Wuninitialized") but I did not realize they were disabled separately. Enable it now that i915 is clean so that it stays that way. Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210824225427.2065517-4-nathan@kernel.org
2021-09-14drm/i915/selftests: Always initialize err in ↵Nathan Chancellor
igt_dmabuf_import_same_driver_lmem() Clang warns: drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:127:13: warning: variable 'err' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] } else if (PTR_ERR(import) != -EOPNOTSUPP) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:138:9: note: uninitialized use occurs here return err; ^~~ drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:127:9: note: remove the 'if' if its condition is always true } else if (PTR_ERR(import) != -EOPNOTSUPP) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:95:9: note: initialize the variable 'err' to silence this warning int err; ^ = 0 The test is expected to pass if i915_gem_prime_import() returns -EOPNOTSUPP so initialize err to zero in this case. Fixes: cdb35d1ed6d2 ("drm/i915/gem: Migrate to system at dma-buf attach time (v7)") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210824225427.2065517-3-nathan@kernel.org
2021-09-14drm/i915/selftests: Do not use import_obj uninitializedNathan Chancellor
Clang warns a couple of times: drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:63:6: warning: variable 'import_obj' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (import != &obj->base) { ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:80:22: note: uninitialized use occurs here i915_gem_object_put(import_obj); ^~~~~~~~~~ drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:63:2: note: remove the 'if' if its condition is always false if (import != &obj->base) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c:38:46: note: initialize the variable 'import_obj' to silence this warning struct drm_i915_gem_object *obj, *import_obj; ^ = NULL Shuffle the import_obj initialization above these if statements so that it is not used uninitialized. Fixes: d7b2cb380b3a ("drm/i915/gem: Correct the locking and pin pattern for dma-buf (v8)") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210824225427.2065517-2-nathan@kernel.org
2021-09-13drm/i915/guc: Add GuC kernel docMatthew Brost
Add GuC kernel doc for all structures added thus far for GuC submission and update the main GuC submission section with the new interface details. v2: - Drop guc_active.lock DOC v3: - Fixup a few kernel doc comments (Daniele) v4 (Daniele): - Implement doc suggestions from John - Add kerneldoc for all members of the GuC structure and pull the file in i915.rst v5 (Daniele): - Implement new doc suggestions from John Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-24-matthew.brost@intel.com
2021-09-13drm/i915/guc: Drop guc_active move everything into guc_stateMatthew Brost
Now that we have locking hierarchy of sched_engine->lock -> ce->guc_state everything from guc_active can be moved into guc_state and protected the guc_state.lock. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-23-matthew.brost@intel.com
2021-09-13drm/i915/guc: Move fields protected by guc->contexts_lock into sub structureMatthew Brost
To make ownership of locking clear move fields (guc_id, guc_id_ref, guc_id_link) to sub structure guc_id in intel_context. Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-22-matthew.brost@intel.com
2021-09-13drm/i915/guc: Move GuC priority fields in context under guc_activeMatthew Brost
Move GuC management fields in context under guc_active struct as this is where the lock that protects theses fields lives. Also only set guc_prio field once during context init. v2: (Daniele) - set CONTEXT_SET_INIT Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-21-matthew.brost@intel.com
2021-09-13drm/i915/guc: Drop pin count check trick between sched_disable and re-pinMatthew Brost
Drop pin count check trick between a sched_disable and re-pin, now rely on the lock and counter of the number of committed requests to determine if scheduling should be disabled on the context. Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-20-matthew.brost@intel.com
2021-09-13drm/i915/guc: Proper xarray usage for contexts_lookupMatthew Brost
Lock the xarray and take ref to the context if needed. v2: (Checkpatch) - Add new line after declaration (Daniel Vetter) - Correct put / get accounting in xa_for_loops v3: (Checkpatch) - Extra new line Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-19-matthew.brost@intel.com
2021-09-13drm/i915/guc: Rework and simplify lockingMatthew Brost
Rework and simplify the locking with GuC subission. Drop sched_state_no_lock and move all fields under the guc_state.sched_state and protect all these fields with guc_state.lock . This requires changing the locking hierarchy from guc_state.lock -> sched_engine.lock to sched_engine.lock -> guc_state.lock. v2: (Daniele) - Don't check fields outside of lock during sched disable, check less fields within lock as some of the outside are no longer needed Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-18-matthew.brost@intel.com
2021-09-13drm/i915/guc: Move guc_blocked fence to struct guc_stateMatthew Brost
Move guc_blocked fence to struct guc_state as the lock which protects the fence lives there. s/ce->guc_blocked/ce->guc_state.blocked/g v2: (Daniele) - s/blocked_fence/blocked/g Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-17-matthew.brost@intel.com
2021-09-13drm/i915/guc: Release submit fence from an irq_workMatthew Brost
A subsequent patch will flip the locking hierarchy from ce->guc_state.lock -> sched_engine->lock to sched_engine->lock -> ce->guc_state.lock. As such we need to release the submit fence for a request from an IRQ to break a lock inversion - i.e. the fence must be release went holding ce->guc_state.lock and the releasing of the can acquire sched_engine->lock. v2: (Daniele) - Delete request from list before calling irq_work_queue Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-16-matthew.brost@intel.com
2021-09-13drm/i915/guc: Reset LRC descriptor if register returns -ENODEVMatthew Brost
Reset LRC descriptor if a context register returns -ENODEV as this means we are mid-reset. Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-15-matthew.brost@intel.com
2021-09-13drm/i915/guc: Don't touch guc_state.sched_state without a lockMatthew Brost
Before we did some clever tricks to not use the a lock when touching guc_state.sched_state in certain cases. Don't do that, enforce the use of the lock. v2: (kernel test robo ) - Add __maybe_unused to sched_state_is_init() v3: rebase after the unused code path removal has been moved to an earlier patch. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-14-matthew.brost@intel.com
2021-09-13drm/i915/guc: Take context ref when cancelling requestMatthew Brost
A context can get destroyed after cancelling a request, if a context or GT reset occurs, so take a reference to context when cancelling a request. Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-13-matthew.brost@intel.com
2021-09-13drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2HMatthew Brost
While debugging an issue with full GT resets I went down a rabbit hole thinking the scrubbing of lost G2H wasn't working correctly. This proved to be incorrect as this was working just fine but this chase inspired me to write a selftest to prove that this works. This simple selftest injects errors dropping various G2H and then issues a full GT reset proving that the scrubbing of these G2H doesn't blow up. v2: (Daniel Vetter) - Use ifdef instead of macros for selftests v3: (Checkpatch) - A space after 'switch' statement v4: (Daniele) - A comment saying GT won't idle if G2H are lost Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-12-matthew.brost@intel.com
2021-09-13drm/i915/guc: Copy whole golden context, set engine state size of subsetMatthew Brost
When the GuC does a media reset, it copies a golden context state back into the corrupted context's state. The address of the golden context and the size of the engine state restore are passed in via the GuC ADS. The i915 had a bug where it passed in the whole size of the golden context, not the size of the engine state to restore resulting in a memory corruption. Also copy the entire golden context on init rather than just the engine state that is restored. v2 (Daniele): use defines to avoid duplicated const variables (John). Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-11-matthew.brost@intel.com
2021-09-13drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, ↵Matthew Brost
not registered When unblocking a context, do not enable scheduling if the context is banned, guc_id invalid, or not registered. v2: (Daniele) - Add helper for unblock Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-10-matthew.brost@intel.com
2021-09-13drm/i915/guc: Kick tasklet after queuing a requestMatthew Brost
Kick tasklet after queuing a request so it submitted in a timely manner. Fixes: 3a4cdf1982f0 ("drm/i915/guc: Implement GuC context operations for new inteface") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-9-matthew.brost@intel.com
2021-09-13Revert "drm/i915/gt: Propagate change in error status to children on unhold"Matthew Brost
Propagating errors to dependent fences is broken and can lead to errors from one client ending up in another. In commit 3761baae908a ("Revert "drm/i915: Propagate errors on awaiting already signaled fences""), we attempted to get rid of fence error propagation but missed the case added in commit 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold"). Revert that one too. This error was found by an up-and-coming selftest which triggers a reset during request cancellation and verifies that subsequent requests complete successfully. v2: (Daniel Vetter) - Use revert v3: (Jason) - Update commit message v4 (Daniele): - fix checkpatch error in commit message. References: '3761baae908a ("Revert "drm/i915: Propagate errors on awaiting already signaled fences"")' Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-8-matthew.brost@intel.com
2021-09-13drm/i915/guc: Workaround reset G2H is received after schedule done G2HMatthew Brost
If the context is reset as a result of the request cancellation the context reset G2H is received after schedule disable done G2H which is the wrong order. The schedule disable done G2H release the waiting request cancellation code which resubmits the context. This races with the context reset G2H which also wants to resubmit the context but in this case it really should be a NOP as request cancellation code owns the resubmit. Use some clever tricks of checking the context state to seal this race until the GuC firmware is fixed. v2: (Checkpatch) - Fix typos v3: (Daniele) - State that is a bug in the GuC firmware Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-7-matthew.brost@intel.com
2021-09-13drm/i915/guc: Process all G2H message at once in work queueMatthew Brost
Rather than processing 1 G2H at a time and re-queuing the work queue if more messages exist, process all the G2H in a single pass of the work queue. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-6-matthew.brost@intel.com
2021-09-13drm/i915/guc: Don't drop ce->guc_active.lock when unwinding contextMatthew Brost
Don't drop ce->guc_active.lock when unwinding a context after reset. At one point we had to drop this because of a lock inversion but that is no longer the case. It is much safer to hold the lock so let's do that. Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface") Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-5-matthew.brost@intel.com
2021-09-13drm/i915/guc: Unwind context requests in reverse orderMatthew Brost
When unwinding requests on a reset context, if other requests in the context are in the priority list the requests could be resubmitted out of seqno order. Traverse the list of active requests in reverse and append to the head of the priority list to fix this. Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-4-matthew.brost@intel.com
2021-09-13drm/i915/guc: Fix outstanding G2H accountingMatthew Brost
A small race that could result in incorrect accounting of the number of outstanding G2H. Basically prior to this patch we did not increment the number of outstanding G2H if we encoutered a GT reset while sending a H2G. This was incorrect as the context state had already been updated to anticipate a G2H response thus the counter should be incremented. As part of this change we remove a legacy (now unused) path that was the last caller requiring a G2H response that was not guaranteed to loop. This allows us to simplify the accounting as we don't need to handle the case where the send fails due to the channel being busy. Also always use helper when decrementing this value. v2 (Daniele): update GEM_BUG_ON check, pull in dead code removal from later patch, remove loop param from context_deregister. Fixes: f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in buffer") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: <stable@vger.kernel.org> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-3-matthew.brost@intel.com
2021-09-13drm/i915/guc: Fix blocked context accountingMatthew Brost
Prior to this patch the blocked context counter was cleared on init_sched_state (used during registering a context & resets) which is incorrect. This state needs to be persistent or the counter can read the incorrect value resulting in scheduling never getting enabled again. Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-2-matthew.brost@intel.com
2021-09-12Linux 5.15-rc1v5.15-rc1Linus Torvalds
2021-09-12Merge tag 'perf-tools-for-v5.15-2021-09-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Add missing fields and remove some duplicate fields when printing a perf_event_attr. - Fix hybrid config terms list corruption. - Update kernel header copies, some resulted in new kernel features being automagically added to 'perf trace' syscall/tracepoint argument id->string translators. - Add a file generated during the documentation build to .gitignore. - Add an option to build without libbfd, as some distros, like Debian consider its ABI unstable. - Add support to print a textual representation of IBS raw sample data in 'perf report'. - Fix bpf 'perf test' sample mismatch reporting - Fix passing arguments to stackcollapse report in a 'perf script' python script. - Allow build-id with trailing zeros. - Look for ImageBase in PE file to compute .text offset. * tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (25 commits) tools headers UAPI: Update tools's copy of drm.h headers tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools headers UAPI: Sync linux/fs.h with the kernel sources tools headers UAPI: Sync linux/in.h copy with the kernel sources perf tools: Add an option to build without libbfd perf tools: Allow build-id with trailing zeros perf tools: Fix hybrid config terms list corruption perf tools: Factor out copy_config_terms() and free_config_terms() perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields perf tools: Ignore Documentation dependency file perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions tools include UAPI: Update linux/mount.h copy perf beauty: Cover more flags in the move_mount syscall argument beautifier tools headers UAPI: Sync linux/prctl.h with the kernel sources tools include UAPI: Sync sound/asound.h copy with the kernel sources tools headers UAPI: Sync linux/kvm.h with the kernel sources tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources perf report: Add support to print a textual representation of IBS raw sample data perf report: Add tools/arch/x86/include/asm/amd-ibs.h perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings ...
2021-09-12Merge tag 'compiler-attributes-for-linus-v5.15-rc1-v2' of ↵Linus Torvalds
git://github.com/ojeda/linux Pull compiler attributes updates from Miguel Ojeda: - Fix __has_attribute(__no_sanitize_coverage__) for GCC 4 (Marco Elver) - Add Nick as Reviewer for compiler_attributes.h (Nick Desaulniers) - Move __compiletime_{error|warning} (Nick Desaulniers) * tag 'compiler-attributes-for-linus-v5.15-rc1-v2' of git://github.com/ojeda/linux: compiler_attributes.h: move __compiletime_{error|warning} MAINTAINERS: add Nick as Reviewer for compiler_attributes.h Compiler Attributes: fix __has_attribute(__no_sanitize_coverage__) for GCC 4
2021-09-12Merge tag 'auxdisplay-for-linus-v5.15-rc1' of git://github.com/ojeda/linuxLinus Torvalds
Pull auxdisplay updates from Miguel Ojeda: "An assortment of improvements for auxdisplay: - Replace symbolic permissions with octal permissions (Jinchao Wang) - ks0108: Switch to use module_parport_driver() (Andy Shevchenko) - charlcd: Drop unneeded initializers and switch to C99 style (Andy Shevchenko) - hd44780: Fix oops on module unloading (Lars Poeschel) - Add I2C gpio expander example (Ralf Schlatterbeck)" * tag 'auxdisplay-for-linus-v5.15-rc1' of git://github.com/ojeda/linux: auxdisplay: Replace symbolic permissions with octal permissions auxdisplay: ks0108: Switch to use module_parport_driver() auxdisplay: charlcd: Drop unneeded initializers and switch to C99 style auxdisplay: hd44780: Fix oops on module unloading auxdisplay: Add I2C gpio expander example
2021-09-12Merge tag 'smp-urgent-2021-09-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "Updates for the SMP and CPU hotplug: - Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the original hotplug code and now causing trouble with the ARM64 cache topology setup due to the pointless SMP function call. It's not longer required as the hotplug callbacks are guaranteed to be invoked on the upcoming CPU. - Remove the deprecated and now unused CPU hotplug functions - Rewrite the CPU hotplug API documentation" * tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: core-api/cpuhotplug: Rewrite the API section cpu/hotplug: Remove deprecated CPU-hotplug functions. thermal: Replace deprecated CPU-hotplug functions. drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
2021-09-12Merge tag 'char-misc-5.15-rc1-lkdtm' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull misc driver fix from Greg KH: "Here is a single patch for 5.15-rc1, for the lkdtm misc driver. It resolves a build issue that many people were hitting with your current tree, and Kees and others felt would be good to get merged before -rc1 comes out, to prevent them from having to constantly hit it as many development trees restart on -rc1, not older -rc releases. It has NOT been in linux-next, but has passed 0-day testing and looks 'obviously correct' when reviewing it locally :)" * tag 'char-misc-5.15-rc1-lkdtm' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: lkdtm: Use init_uts_ns.name instead of macros
2021-09-12Merge tag 'for-linus-5.15-1' of git://github.com/cminyard/linux-ipmiLinus Torvalds
Pull IPMI updates from Corey Minyard: "A couple of very minor fixes for style and rate limiting. Nothing big, but probably needs to go in" * tag 'for-linus-5.15-1' of git://github.com/cminyard/linux-ipmi: char: ipmi: use DEVICE_ATTR helper macro ipmi: rate limit ipmi smi_event failure message
2021-09-12Merge tag 'sched_urgent_for_v5.15_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Make sure the idle timer expires in hardirq context, on PREEMPT_RT - Make sure the run-queue balance callback is invoked only on the outgoing CPU * tag 'sched_urgent_for_v5.15_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Prevent balance_push() on remote runqueues sched/idle: Make the idle timer expire in hard interrupt context
2021-09-12Merge tag 'locking_urgent_for_v5.15_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Borislav Petkov: - Fix the futex PI requeue machinery to not return to userspace in inconsistent state - Avoid a potential null pointer dereference in the ww_mutex deadlock check - Other smaller cleanups and optimizations * tag 'locking_urgent_for_v5.15_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rtmutex: Fix ww_mutex deadlock check futex: Remove unused variable 'vpid' in futex_proxy_trylock_atomic() futex: Avoid redundant task lookup futex: Clarify comment for requeue_pi_wake_futex() futex: Prevent inconsistent state and exit race futex: Return error code instead of assigning it without effect locking/rwsem: Add missing __init_rwsem() for PREEMPT_RT