summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/selftest_lrc.c
AgeCommit message (Collapse)Author
2020-09-07drm/i915/selftests: Fix locking inversion in lrc selftest.Maarten Lankhorst
This function does not use intel_context_create_request, so it has to use the same locking order as normal code. This is required to shut up lockdep in selftests. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-20-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.Maarten Lankhorst
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915/gem: Remove disordered per-file request list for throttlingChris Wilson
I915_GEM_THROTTLE dates back to the time before contexts where there was just a single engine, and therefore a single timeline and request list globally. That request list was in execution/retirement order, and so walking it to find a particular aged request made sense and could be split per file. That is no more. We now have many timelines with a file, as many as the user wants to construct (essentially per-engine, per-context). Each of those run independently and so make the single list futile. Remove the disordered list, and iterate over all the timelines to find a request to wait on in each to satisfy the criteria that the CPU is no more than 20ms ahead of its oldest request. It should go without saying that the I915_GEM_THROTTLE ioctl is no longer used as the primary means of throttling, so it makes sense to push the complication into the ioctl where it only impacts upon its few irregular users, rather than the execbuf/retire where everybody has to pay the cost. Fortunately, the few users do not create vast amount of contexts, so the loops over contexts/engines should be concise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-07-08drm/i915: Move the engine mask to intel_gt_infoDaniele Ceraolo Spurio
Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
2020-06-18drm/i915/selftests: Enable selftesting of busy-statsChris Wilson
A couple of very simple tests to ensure that the basic properties of per-engine busyness accounting [0% and 100% busy] are faithful. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-1-chris@chris-wilson.co.uk
2020-06-17drm/i915/selftests: fix spelling mistake "submited" -> "submitted"Colin Ian King
There is a spelling mistake in a pr_err message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200617085207.167552-1-colin.king@canonical.com
2020-06-17drm/i915/selftests: Check preemption rollback of different ring queue depthsChris Wilson
Like live_unlite_ring, but instead of simply looking at the impact of intel_ring_direction(), check that preemption more generally works with different depths of queued requests in the ring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200616233733.18050-1-chris@chris-wilson.co.uk
2020-06-17drm/i915/selftests: Use friendly request names for live_timeslice_rewindChris Wilson
Rather than mixing [012] and (A1, A2, B2) for the request indices, use the enums throughout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200616185518.11948-2-chris@chris-wilson.co.uk
2020-06-17drm/i915/selftests: Exercise far preemption rollbacksChris Wilson
Not too long ago, we realised we had issues with a rolling back a context so far for a preemption request we considered the resubmit not to be a rollback but a forward roll. This means we would issue a lite restore instead of forcing a full restore, continuing execution of the old requests rather than causing a preemption. Add a selftest to exercise such a far rollback, such that if we were to skip the full restore, we would execute invalid instructions in the ring and hang. Note that while I was able to confirm that this causes us to do a lite-restore preemption rollback (with commit e36ba817fa96 ("drm/i915/gt: Incrementally check for rewinding") disabled), it did not trick the HW into rolling past the old RING_TAIL. Myybe on other HW. References: e36ba817fa96 ("drm/i915/gt: Incrementally check for rewinding") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200616185518.11948-1-chris@chris-wilson.co.uk
2020-06-16drm/i915/selftests: Fix inconsistent IS_ERR and PTR_ERRGustavo A. R. Silva
Fix inconsistent IS_ERR and PTR_ERR in live_timeslice_nopreempt(). The proper pointer to be passed as argument to PTR_ERR() is ce. This bug was detected with the help of Coccinelle. Fixes: b72f02d78e4f ("drm/i915/gt: Prevent timeslicing into unpreemptable requests") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200616145452.GA25291@embeddedor
2020-06-15drm/i915/selftests: Disable preemptive heartbeats over preemption testsChris Wilson
Since the heartbeat may cause a preemption event, disable it over the preemption suppression tests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200615165013.22973-1-chris@chris-wilson.co.uk
2020-06-13drm/i915/selftests: Trim execlists runtimeChris Wilson
Reduce the smoke depth by trimming the number of contexts, repetitions and wait times. This is in preparation for a less greedy scheduler that tries to be fair across contexts, resulting in a great many more context switches. A thousand context switches may be 50-100ms, causing us to timeout as the HW is not fast enough to complete the deep smoketests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-5-chris@chris-wilson.co.uk
2020-06-11drm/i915/selftests: Remove live_suppress_wait_preemptChris Wilson
With the removal of the internal wait-priority boosting, we can also remove the selftest to ensure that those waits were being suppressed from causing preemptions. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-4-chris@chris-wilson.co.uk
2020-06-04drm/i915: Trim set_timer_ms() intervalsChris Wilson
Use the plain msec_to_jiffies() rather than the _timeout variant so we round down and do not add an extra jiffy to our interval. For example, with timeslicing we do not want to err on the longer side as any fairness depends on catching hogging contexts on the GPU. Bring on CFS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604135938.3975-1-chris@chris-wilson.co.uk
2020-05-28drm/i915/gt: Prevent timeslicing into unpreemptable requestsChris Wilson
We have a I915_REQUEST_NOPREEMPT flag that we set when we must prevent the HW from preempting during the course of this request. We need to honour this flag and protect the HW even if we have a heartbeat request, or other maximum priority barrier, pending. As such, restrict the timeslicing check to avoid preempting into the topmost priority band, leaving the unpreemptable requests in blissful peace running uninterrupted on the HW. v2: Set the I915_PRIORITY_BARRIER to be less than I915_PRIORITY_UNPREEMPTABLE so that we never submit a request (heartbeat or barrier) that can legitimately preempt the current non-premptable request. Fixes: 2a98f4e65bba ("drm/i915: add infrastructure to hold off preemption on a request") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200527162418.24755-1-chris@chris-wilson.co.uk
2020-05-21drm/i915/selftests: Flush the submission, not cancel it!Chris Wilson
Use intel_engine_flush_submission() when we want to ensure that the tasklet is run. tasklet_kill(), while it may ensure that an ongoing tasklet is completed, also prevents the tasklet from running if it's already scheduled and hasn't yet been run. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1874 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200521124304.3157692-1-chris@chris-wilson.co.uk
2020-05-19drm/i915/selftests: Add tests for timeslicing virtual enginesChris Wilson
Make sure that we can execute a virtual request on an already busy engine, and conversely that we can execute a normal request if the engines are already fully occupied by virtual requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200519132046.22443-1-chris@chris-wilson.co.uk
2020-05-19drm/i915/selftests: Check for an initial-breadcrumb in wait_for_submit()Chris Wilson
When we look at i915_request_is_started() we must be careful in case we are using a request that does not have the initial-breadcrumb and instead the is-started is being compared against the end of the previous request. This will make wait_for_submit() declare that a request has been already submitted too early. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200519063123.20673-4-chris@chris-wilson.co.uk
2020-05-19drm/i915/selftests: Restore to default heartbeatChris Wilson
Since we temporarily disable the heartbeat and restore back to the default value, we can use the stored defaults on the engine and avoid using a local. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200519063123.20673-3-chris@chris-wilson.co.uk
2020-05-19drm/i915/selftests: Change priority overflow detectionChris Wilson
Check for integer overflow in the priority chain, rather than against a type-constricted max-priority check. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200519063123.20673-2-chris@chris-wilson.co.uk
2020-05-18drm/i915/selftests: Refactor sibling selectionChris Wilson
Tvrtko spotted that some selftests were using 'break' not 'continue', which will fail for discontiguous engine layouts such as on Icelake (which may have vcs0 and vcs2). Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200518102911.3463-1-chris@chris-wilson.co.uk
2020-05-05drm/i915/gt: Stop holding onto the pinned_default_stateChris Wilson
As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is we only need our bare protocontext. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk
2020-05-04drm/i915/gem: Implement legacy MI_STORE_DATA_IMMChris Wilson
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
2020-04-29drm/i915/gt: Keep a no-frills swappable copy of the default context stateChris Wilson
We need to keep the default context state around to instantiate new contexts (aka golden rendercontext), and we also keep it pinned while the engine is active so that we can quickly reset a hanging context. However, the default contexts are large enough to merit keeping in swappable memory as opposed to kernel memory, so we store them inside shmemfs. Currently, we use the normal GEM objects to create the default context image, but we can throw away all but the shmemfs file. This greatly simplifies the tricky power management code which wants to run underneath the normal GT locking, and we definitely do not want to use any high level objects that may appear to recurse back into the GT. Though perhaps the primary advantage of the complex GEM object is that we aggressively cache the mapping, but here we are recreating the vm_area everytime time we unpark. At the worst, we add a lightweight cache, but first find a microbenchmark that is impacted. Having started to create some utility functions to make working with shmemfs objects easier, we can start putting them to wider use, where GEM objects are overkill, such as storing persistent error state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
2020-04-29drm/i915/selftests: fix error handling in __live_lrc_indirect_ctx_bb()Dan Carpenter
If intel_context_create() fails then it leads to an error pointer dereference. I shuffled things around to make error handling easier. Fixes: 1dd47b54baea ("drm/i915: Add live selftests for indirect ctx batchbuffers") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda
2020-04-25drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTLMika Kuoppala
Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption. v2: add to lrc layout (Chris) v3: end to a cacheline (Chris) v4: add to lrc fixed (Chris) v5: value in offset+1 Testcase: igt/i915_selftest/gt_lrc Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com
2020-04-25drm/i915: Add live selftests for indirect ctx batchbuffersMika Kuoppala
Indirect ctx batchbuffers are a hw feature of which batch can be run, by hardware, during context restoration stage. Driver can setup a batchbuffer and also an offset into the context image. When context image is marshalled from memory to registers, and when the offset from the start of context register state is equal of what driver pre-determined, batch will run. So one can manipulate context restoration process at cacheline granularity, given some limitations, as you need to have rudimentaries in place before you can run a batch. Add selftest which will write the ring start register to a canary spot. This will test that hardware will run a batchbuffer for the context in question. v2: request wait fix, naming (Chris) v3: test order (Chris) v4: rebase Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com
2020-04-25drm/i915: Add engine scratch register to live_lrc_fixedMika Kuoppala
General purpose registers are per engine and in a fixed location. Add to live_lrc_fixed. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-1-mika.kuoppala@linux.intel.com
2020-04-24drm/i915: Make define for lrc state offsetMika Kuoppala
More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423182355.21837-3-mika.kuoppala@linux.intel.com
2020-04-24drm/i915/selftests: Add context batchbuffers registers to live_lrc_fixedMika Kuoppala
Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed for verification. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423224159.22078-1-chris@chris-wilson.co.uk
2020-04-22drm/i915/selftests: Try to detect rollback during batchbuffer preemptionChris Wilson
Since batch buffers dominant execution time, most preemption requests should naturally occur during execution of a batch buffer. We wish to verify that should a preemption occur within a batch buffer, when we come to restart that batch buffer, it occurs at the interrupted instruction and most importantly does not rollback to an earlier point. v2: Do not clear the GPR at the start of the batch, but rely on them being clear for new contexts. Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422100903.25216-1-chris@chris-wilson.co.uk
2020-04-10drm/i915/selftests: Check for an already completed timesliceChris Wilson
With timeslice yielding on a semaphore, we may complete timeslices much faster than we were expecting and already have yielded the stuck request. Before complaining that timeslicing is not enabled, check that we haven't already applied the switch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200410081638.19893-1-chris@chris-wilson.co.uk
2020-04-08drm/i915/selftests: Take an explicit ref for rq->batchChris Wilson
Since we are peeking into the batch object of the request, it is beholden on us to hold a reference to it. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1634 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200408091723.28937-1-chris@chris-wilson.co.uk
2020-04-08drm/i915/selftests: Drop vestigal timeslicing assertChris Wilson
Since the semaphore interrupt may cause us to yield the timeslice immediately, we may cancel the timer before we notice the submission is complete. The assertion is no longer valid due to the race with the interrupt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407222625.15542-1-chris@chris-wilson.co.uk
2020-04-07drm/i915/gt: Yield the timeslice if caught waiting on a user semaphoreChris Wilson
If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the user batch or in our own preamble, the engine raises a GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so respond to a semaphore wait by yielding the timeslice, if we have another context to yield to! The only real complication is that the interrupt is only generated for the start of the semaphore wait, and is asynchronous to our process_csb() -- that is, we may not have registered the timeslice before we see the interrupt. To ensure we don't miss a potential semaphore blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark the interrupt and apply it to the next timeslice regardless of whether it was active at the time. v2: We use semaphores in preempt-to-busy, within the timeslicing implementation itself! Ergo, when we do insert a preemption due to an expired timeslice, the new context may start with the missed semaphore flagged by the retired context and be yielded, ad infinitum. To avoid this, read the context id at the time of the semaphore interrupt and only yield if that context is still active. Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk
2020-04-03drm/i915/selftests: Wait until we start timeslicing after a submitChris Wilson
If we submit, we do not start timeslicing until we process the CS event that marks the start of the context running on HW. So in the selftest, be sure to wait until we have processed the pending events before asserting that timeslicing has begun. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200403190209.21818-1-chris@chris-wilson.co.uk
2020-04-02drm/i915/selftests: Check for has-reset before testing hostile contextsChris Wilson
In order to kill off a hostile context, we need to be able to reset the GPU. So check that is supported prior to beginning the test. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402205839.25065-1-chris@chris-wilson.co.uk
2020-03-31drm/i915: Report all failed registers for ctx isolationMika Kuoppala
For CI it is enough to point out a single failure in isolation. However it is beneficial to gather info in logs for transients further down the line. Do not stop into first comparison failure but continue probing forward. v2: for all engines and poisons (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200331135403.16906-1-mika.kuoppala@linux.intel.com
2020-03-31drm/i915/selftests: Tidy up an error message for live_error_interruptChris Wilson
Since we don't wait for the error interrupt to reset, restart and then complete the guilty request, clean up the error messages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200331091459.29179-1-chris@chris-wilson.co.uk
2020-03-30drm/i915/selftests: Check timeout before flush and cond checksChris Wilson
Allow a bit of leniency for the CPU scheduler to be distracted while we flush the tasklet and so ensure that we always check the status of the request once more before timing out. v2: Wait until the HW acked the submit, and we do any secondary actions for the submit (e.g. timeslices) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200330121644.25277-1-chris@chris-wilson.co.uk
2020-03-13drm/i915/selftest: Add more poison patternsChris Wilson
Throw in the inverse patterns to create more examples of poison to use against the LRC state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200313102812.30173-1-chris@chris-wilson.co.uk
2020-03-03drm/i915/selftests: Fix uninitialized variableAditya Swarup
Static code analysis tool identified struct lrc_timestamp data as being uninitialized and then data.ce[] is being checked for NULL/negative value in the error path. Initializing data variable fixes the issue. Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200303142347.15696-1-aditya.swarup@intel.com
2020-02-28drm/i915/selftests: Be a little more lenient for reset workersChris Wilson
Give the reset worker a kick before losing help when waiting for hang recovery, as the CPU scheduler is a little unreliable. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-15-chris@chris-wilson.co.uk
2020-02-28drm/i915/selftests: Wait for the context switchChris Wilson
As we require a context switch to ensure that the current context is switched out and saved to memory, perform an explicit switch to the kernel context and wait for it. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1336 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228082330.2411941-18-chris@chris-wilson.co.uk
2020-02-28drm/i915/selftests: Check recovery from corrupted LRCChris Wilson
Check that we can recover if the LRC is totally corrupted. Based on a very simple theory that anything that can be adjusted via the context (i.e. on behalf of the user), should be under the purview of the per-engine-reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-13-chris@chris-wilson.co.uk
2020-02-28drm/i915/selftests: Verify LRC isolationChris Wilson
Record the LRC registers before/after a preemption event to ensure that the first context sees nothing from the second client; at least in the normal per-context register state. References: https://gitlab.freedesktop.org/drm/intel/issues/1233 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-12-chris@chris-wilson.co.uk
2020-02-20drm/i915/guc: Kill USES_GUC_SUBMISSION macroDaniele Ceraolo Spurio
use intel_uc_uses_guc_submission() directly instead, to be consistent in the way we check what we want to do with the GuC. v2: do not go through ctx->vm->gt, use i915->gt instead Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> #v1 Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200218223327.11058-3-daniele.ceraolospurio@intel.com
2020-02-19drm/i915/selftests: Mark GPR checking more hostileChris Wilson
Currently, we check that a new context has a clear set of general purpose registers. Add a little bit of hostility by preempting our new context and re-poisoning the GPR to ensure that there is no context leakage from preemption. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200219123418.1447428-1-chris@chris-wilson.co.uk
2020-02-19drm/i915/selftest: Analyse timestamp behaviour across context switchesChris Wilson
Check that the CTX_TIMESTAMP is monotonic across context save/restore and upon preemption. References: https://gitlab.freedesktop.org/drm/intel/issues/1233 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200219112004.1412791-1-chris@chris-wilson.co.uk
2020-02-18drm/i915/selftests: Flush tasklet on wait_for_submit()Chris Wilson
Always flush the tasklet if we have pending submissions in wait_for_submit(), so that even if we see the HW has started before we process its ack, when we return the execlists state is well defined. Fixes: 06289949b8dd ("drm/i915/selftests: Check for any sign of request starting in wait_for_submit()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200218211215.1336341-1-chris@chris-wilson.co.uk