linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2018-07-16	drm/amdgpu/pp: split out common smumgr smu9 code	Alex Deucher
	Split out the shared smumgr code for vega10 and 12 so we don't have duplicate code for both. Reviewed-by: Rex Zhu <rezhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-16	drm/amdgpu/pp: remove dead vega12 code	Alex Deucher
	Commented out. Reviewed-by: Rex Zhu <rezhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-16	drm/i915/selftests: Force a preemption hang	Chris Wilson
	Inject a failure into preemption completion to pretend as if the HW didn't successfully handle preemption and we are forced to do a reset in the middle. v2: Wait for preemption, to force testing with the missed preemption. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180716132154.12539-1-chris@chris-wilson.co.uk
2018-07-16	drm/i915/execlists: Always clear preempt status on cancelling all	Chris Wilson
	On reset/wedging, we cancel all pending replies from the HW and we also want to cancel an outstanding preemption event. Since we use the same function to cancel the pending replies for reset and for a preemption event, we can simply clear the active tracking for all. v2: Keep execlists_user_end() markup for wedging v3: Move assignment to inline to hide the bare assignment. Fixes: 60a943245413 ("drm/i915/execlists: Drop clear_gtiir() on GPU reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180716125424.5715-1-chris@chris-wilson.co.uk
2018-07-16	drm/dp_helper: Add DP aux channel tracing	Lyude Paul
	This is something we've needed for a very long time now, as it makes debugging issues with faulty MST hubs along with debugging issues regarding us interfacing with hubs correctly vastly easier to debug. Currently this can actually be done if you trace the i2c devices for DP using ftrace but that's significantly less useful for a couple of reasons: - Tracing the i2c devices through ftrace means all of the traces are going to contain a lot of "garbage" output that we're sending over the i2c line. Most of this garbage comes from retrying transactions, DRM's helper library adding extra transactions to work around bad hubs, etc. - Having a user set up ftrace so that they can provide debugging information is a lot more difficult then being able to say "just boot with drm.debug=0x100" - We can potentially expand upon this tracing in the future to print debugging information in regards to other DP transactions like MST sideband transactions This is inspired by a patch Rob Clark sent to do this a long time back. Neither of us could find the patch however, so we both assumed it would probably just be easier to rewrite it anyway. Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180716154432.13433-1-lyude@redhat.com
2018-07-16	drm: writeback: Fix doc that says connector should be disconnected	Alexandru Gheorghe
	During iteration process one of the proposed mechanism for not breaking existing userspace was to report writeback connectors as disconnected, however the final version used DRM_CLIENT_CAP_WRITEBACK_CONNECTORS for that purpose. Change-Id: I2319d099f7669094c8530f1521abdbca08e76486 Signed-off-by: Alexandru Gheorghe <alexandru-cosmin.gheorghe@arm.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Acked-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/238399/
2018-07-16	dma-buf: Move BUG_ON from _add_shared_fence to _add_shared_inplace	Michel Dänzer
	Fixes the BUG_ON spuriously triggering under the following circumstances: * reservation_object_reserve_shared is called with shared_count == shared_max - 1, so obj->staged is freed in preparation of an in-place update. * reservation_object_add_shared_fence is called with the first fence, after which shared_count == shared_max. * reservation_object_add_shared_fence is called with a follow-up fence from the same context. In the second reservation_object_add_shared_fence call, the BUG_ON triggers. However, nothing bad would happen in reservation_object_add_shared_inplace, since both fences are from the same context, so they only occupy a single slot. Prevent this by moving the BUG_ON to where an overflow would actually happen (e.g. if a buggy caller didn't call reservation_object_reserve_shared before). v2: * Fix description of breaking scenario (Christian König) * Add bugzilla reference Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/106418 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1 Reviewed-by: Christian König <christian.koenig@amd.com> # v1 Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20180704151405.10357-1-michel@daenzer.net
2018-07-16	drm/i915/execlists: Disable submission tasklet upon wedging	Chris Wilson
	If we declare the driver wedged before the GPU truly is, then we may see the GPU complete some CS events following our cancellation. This leaves us quite confused as we deleted all the bookkeeping and thus complain about the inconsistent state. We can just ignore the remaining events and let the GPU idle by not feeding it, and so avoid trying to racily overwrite shared state. We rely on there being a full GPU reset before unwedging, giving us the opportunity to reset the shared state. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180716080332.32283-4-chris@chris-wilson.co.uk
2018-07-16	drm/i915: Remove pci private pointer after destroying the device private	Chris Wilson
	On an aborted module load, we unwind and free our device private - but we left a dangling pointer to our privates inside the pci_device. After the attempted aborted unload, we may still get a call to i915_pci_remove() when the module is removed, potentially chasing stale data. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180716080332.32283-5-chris@chris-wilson.co.uk
2018-07-16	drm/i915/selftests: Downgrade igt_timeout message	Chris Wilson
	Give in, since CI continues to incorrectly insist that KERN_NOTICE is a warning and flags the timeout message as unwanted spam. At first, the intention was to use the message to indicate which tests might warrant an extended run, but virtually all tests require a timeout so it is simply not as interesting as first thought. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103667 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180716080332.32283-6-chris@chris-wilson.co.uk
2018-07-16	drm/meson: Make DMT timings parameters and pixel clock generic	Neil Armstrong
	Remove the modes timings tables for DMT modes and calculate the HW paremeters from the modes timings. Switch the DMT modes pixel clock calculation out of the static frequency list to a generic calculation from a range of possible PLL dividers. This patch is an intermediate step towards usage of the Common Clock Framwework for PLL setup, by reworking the code to have common sel_pll() function called by the CEA (HDMI) freq setup and the generic DMT frequencies setup, we should be able to simply call clk_set_rate() on the PLL clock handle in a near future. The CEA (HDMI) and CVBS modes needs very specific clock paths that CCF will never be able to determine by itself, so there is still some work to do for a full handoff to CCF handling the clocks. This setup permits setting non-CEA modes like : - 1600x900-60Hz - 1280x1024-75Hz - 1280x1024-60Hz - 1440x900-60Hz - 1366x768-60Hz - 1280x800-60Hz - 1152x864-75Hz - 1024x768-75Hz - 1024x768-70Hz - 1024x768-60Hz - 832x624-75Hz - 800x600-75Hz - 800x600-72Hz - 800x600-60Hz - 640x480-75Hz - 640x480-73Hz - 640x480-67Hz Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Jerome Brunet <jbrunet@baylibre.com> [narmstrong: fixed trivial checkpatch issues] Link: https://patchwork.freedesktop.org/patch/msgid/1531726814-14638-1-git-send-email-narmstrong@baylibre.com
2018-07-16	drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping	Thierry Reding
	Depending on the kernel configuration, early ARM architecture setup code may have attached the GPU to a DMA/IOMMU mapping that transparently uses the IOMMU to back the DMA API. Tegra requires special handling for IOMMU backed buffers (a special bit in the GPU's MMU page tables indicates the memory path to take: via the SMMU or directly to the memory controller). Transparently backing DMA memory with an IOMMU prevents Nouveau from properly handling such memory accesses and causes memory access faults. As a side-note: buffers other than those allocated in instance memory don't need to be physically contiguous from the GPU's perspective since the GPU can map them into contiguous buffers using its own MMU. Mapping these buffers through the IOMMU is unnecessary and will even lead to performance degradation because of the additional translation. One exception to this are compressible buffers which need large pages. In order to enable these large pages, multiple small pages will have to be combined into one large (I/O virtually contiguous) mapping via the IOMMU. However, that is a topic outside the scope of this fix and isn't currently supported. An implementation will want to explicitly create these large pages in the Nouveau driver, so detaching from a DMA/IOMMU mapping would still be required. Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()	Thierry Reding
	Instead of setting the DMA ops pointer to NULL, set the correct, non-IOMMU ops depending on the device's coherency setting. Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/secboot/acr: Remove VLA usage	Kees Cook
	In the quest to remove all stack VLA usage from the kernel[1], this allocates the working buffers before starting the writing so it won't abort in the middle. This needs an initial walk of the lists to figure out how large the buffer should be. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau: Replace drm_dev_unref with drm_dev_put	Thomas Zimmermann
	This patch unifies the naming of DRM functions for reference counting of struct drm_device. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau: Replace drm_gem_object_unreference_unlocked with put function	Thomas Zimmermann
	This patch unifies the naming of DRM functions for reference counting of struct drm_gem_object. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau: Replace drm_framebuffer_{un/reference} with put, get functions	Thomas Zimmermann
	This patch unifies the naming of DRM functions for reference counting of struct drm_framebuffer. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/nvif: remove const attribute from nvif_mclass	Nick Desaulniers
	Similar to commit 0bf8bf50eddc ("module: Remove const attribute from alias for MODULE_DEVICE_TABLE") Fixes many -Wduplicate-decl-specifier warnings due to the combination of const typeof() of already const variables. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/hwmon: potential uninitialized variables	Dan Carpenter
	Smatch complains that "value" can be uninitialized when kstrtol() returns -ERANGE. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau: Fix runtime PM leak in drm_open()	Lyude Paul
	Noticed this as I was skimming through, if we fail to allocate memory for cli we'll end up returning without dropping the runtime PM ref we got. Additionally, we'll even return the wrong return code! (ret most likely will == 0 here, we want -ENOMEM). Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/debugfs: Wake up GPU before doing any reclocking	Karol Herbst
	Fixes various reclocking related issues on prime systems. Signed-off-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Martin Peres <martin.peres@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/bios/vpstate: There are some fermi vbios with no boost or tdp entry	Karol Herbst
	If the entry size is too small, default to invalid values for both boost_id and tdp_id, so as to default to the base clock in both cases. Signed-off-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Martin Peres <martin.peres@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/kms/nv50-: Allow vblank_disable_immediate	Mario Kleiner
	With instantaneous high precision vblank timestamping that updates at leading edge of vblank, the emulated "hw vblank counter" from vblank timestamping, which increments at leading edge of vblank, and reliable page flip execution and completion at leading edge of vblank, we should meet the requirements for fast/ immediate vblank irq disable/enable. This is only allowed on nv50+ gpu's, ie. the ones with atomic modesetting. One requirement for immediate vblank disable is that high precision vblank timestamping works reliably all the time on all connectors. This is not the case on all pre-nv50 parts for analog VGA outputs, where we currently don't always have support for scanout position queries and therefore fall back to vblank interrupt timestamping. The implementation in nv04_head_state() does not return valid values for vblanks, vtotal, hblanks, htotal for VGA outputs on all cards, but those are needed for scanout position queries. Testing on Linux-4.12-rc5 + drm-next on a GeForce 9500 GT (NV G96) with timing measurement equipment indicates this works fine, so allow immediate vblank disable for power saving. For debugging in case of unexpected trouble, booting with kernel cmdline option drm.vblankoffdelay=0 (or echo 0 > /sys/module/drm/parameters/vblankoffdelay) would keep vblank irqs permanently on to approximate old behavior. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/kms/nv50-: remove duplicate assignment	Ben Skeggs
	Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/kms/nv50-: fix drm-get-put.cocci warnings	kbuild test robot
	Use drm__get() and drm__put() helpers instead of drm__reference() and drm__unreference() helpers. Generated by: scripts/coccinelle/api/drm-get-put.cocci Fixes: 30ed49b55b6e ("drm/nouveau/kms/nv50-: move code underneath dispnv50/") Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/disp/nv50-gp10x: fix coverity warning	Ben Skeggs
	Change values to u32, there's no need for them to be 64-bit. Reported-by: Colin Ian King <colin.king@canonical.com> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/core: ERR_PTR vs NULL bug in nvkm_engine_info()	Ben Skeggs
	Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/mmu/gp10b: remove ghost file	Jérôme Glisse
	This ghost file have been haunting us. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/secboot/tegra: Enable gp20b/gp10b firmware tag when relevant	Nicolas Chauvet
	This allows to have the related MODULE_FIRMWARE tag only on relevant arch (arm64). This will saves about 400k on initramfs when not relevant Signed-off-by: Nicolas Chauvet <kwizart@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/fault/gv100: fix fault buffer initialisation	Ben Skeggs
	Not sure how this happened, it worked last time I tested it! Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-16	drm/nouveau/gr/gv100: handle multiple SM-per-TPC for shader exceptions	Ben Skeggs
	Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-07-14	drm/i915/guc: Disable rpm wakeref asserts in GuC irq handler	Michał Winiarski
	We're seeing "RPM wakelock ref not held during HW access" warning otherwise. Since IRQs are synced for runtime suspend we can just disable the wakeref asserts. Reported-by: Marta Löfstedt <marta.lofstedt@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105710 Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180714173703.7894-1-chris@chris-wilson.co.uk Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-07-13	drm/i915/execlists: Drop clear_gtiir() on GPU reset	Chris Wilson
	With the new CSB processing code, we are not vulnerable to delayed delivery of a pre-reset interrupt as we use the CSB status pointers in the HWSP to decide if we need to parse any CSB events and no longer need to wait for the first post-reset interrupt to be assured that the CSB mmio registers are valid. The new icl code to clear registers has a nasty lock inversion: [ 57.409776] ====================================================== [ 57.409779] WARNING: possible circular locking dependency detected [ 57.409783] 4.18.0-rc4-CI-CI_DII_1137+ #1 Tainted: G U W [ 57.409785] ------------------------------------------------------ [ 57.409788] swapper/6/0 is trying to acquire lock: [ 57.409790] 000000004f304ee5 (&engine->timeline.lock/1){-.-.}, at: execlists_submit_request+0x2b/0x1a0 [i915] [ 57.409841] but task is already holding lock: [ 57.409844] 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915] [ 57.409869] which lock already depends on the new lock. [ 57.409872] the existing dependency chain (in reverse order) is: [ 57.409876] -> #2 (&(&rq->lock)->rlock#2){-.-.}: [ 57.409900] notify_ring+0x2b2/0x480 [i915] [ 57.409922] gen8_cs_irq_handler+0x39/0xa0 [i915] [ 57.409943] gen11_irq_handler+0x2f0/0x420 [i915] [ 57.409949] __handle_irq_event_percpu+0x42/0x370 [ 57.409952] handle_irq_event_percpu+0x2b/0x70 [ 57.409956] handle_irq_event+0x2f/0x50 [ 57.409959] handle_edge_irq+0xe7/0x190 [ 57.409964] handle_irq+0x67/0x160 [ 57.409967] do_IRQ+0x5e/0x120 [ 57.409971] ret_from_intr+0x0/0x1d [ 57.409974] _raw_spin_unlock_irqrestore+0x4e/0x60 [ 57.409979] tasklet_action_common.isra.5+0x47/0xb0 [ 57.409982] __do_softirq+0xd9/0x505 [ 57.409985] irq_exit+0xa9/0xc0 [ 57.409988] do_IRQ+0x9a/0x120 [ 57.409991] ret_from_intr+0x0/0x1d [ 57.409995] cpuidle_enter_state+0xac/0x360 [ 57.409999] do_idle+0x1f3/0x250 [ 57.410004] cpu_startup_entry+0x6a/0x70 [ 57.410010] start_secondary+0x19d/0x1f0 [ 57.410015] secondary_startup_64+0xa5/0xb0 [ 57.410018] -> #1 (&(&dev_priv->irq_lock)->rlock){-.-.}: [ 57.410081] clear_gtiir+0x30/0x200 [i915] [ 57.410116] execlists_reset+0x6e/0x2b0 [i915] [ 57.410140] i915_reset_engine+0x111/0x190 [i915] [ 57.410165] i915_handle_error+0x11a/0x4a0 [i915] [ 57.410198] i915_hangcheck_elapsed+0x378/0x530 [i915] [ 57.410204] process_one_work+0x248/0x6c0 [ 57.410207] worker_thread+0x37/0x380 [ 57.410211] kthread+0x119/0x130 [ 57.410215] ret_from_fork+0x3a/0x50 [ 57.410217] -> #0 (&engine->timeline.lock/1){-.-.}: [ 57.410224] _raw_spin_lock_irqsave+0x33/0x50 [ 57.410256] execlists_submit_request+0x2b/0x1a0 [i915] [ 57.410289] submit_notify+0x8d/0x124 [i915] [ 57.410314] __i915_sw_fence_complete+0x81/0x250 [i915] [ 57.410339] dma_i915_sw_fence_wake+0xd/0x20 [i915] [ 57.410344] dma_fence_signal_locked+0x79/0x200 [ 57.410368] notify_ring+0x2ba/0x480 [i915] [ 57.410392] gen8_cs_irq_handler+0x39/0xa0 [i915] [ 57.410416] gen11_irq_handler+0x2f0/0x420 [i915] [ 57.410421] __handle_irq_event_percpu+0x42/0x370 [ 57.410425] handle_irq_event_percpu+0x2b/0x70 [ 57.410428] handle_irq_event+0x2f/0x50 [ 57.410432] handle_edge_irq+0xe7/0x190 [ 57.410436] handle_irq+0x67/0x160 [ 57.410439] do_IRQ+0x5e/0x120 [ 57.410445] ret_from_intr+0x0/0x1d [ 57.410449] cpuidle_enter_state+0xac/0x360 [ 57.410453] do_idle+0x1f3/0x250 [ 57.410456] cpu_startup_entry+0x6a/0x70 [ 57.410460] start_secondary+0x19d/0x1f0 [ 57.410464] secondary_startup_64+0xa5/0xb0 [ 57.410466] other info that might help us debug this: [ 57.410471] Chain exists of: &engine->timeline.lock/1 --> &(&dev_priv->irq_lock)->rlock --> &(&rq->lock)->rlock#2 [ 57.410481] Possible unsafe locking scenario: [ 57.410485] CPU0 CPU1 [ 57.410487] ---- ---- [ 57.410490] lock(&(&rq->lock)->rlock#2); [ 57.410494] lock(&(&dev_priv->irq_lock)->rlock); [ 57.410498] lock(&(&rq->lock)->rlock#2); [ 57.410503] lock(&engine->timeline.lock/1); [ 57.410506] * DEADLOCK * [ 57.410511] 4 locks held by swapper/6/0: [ 57.410514] #0: 0000000074575789 (&(&dev_priv->irq_lock)->rlock){-.-.}, at: gen11_irq_handler+0x8a/0x420 [i915] [ 57.410542] #1: 000000009b29b30e (rcu_read_lock){....}, at: notify_ring+0x1a/0x480 [i915] [ 57.410573] #2: 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915] [ 57.410601] #3: 000000009b29b30e (rcu_read_lock){....}, at: submit_notify+0x35/0x124 [i915] [ 57.410635] stack backtrace: [ 57.410640] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G U W 4.18.0-rc4-CI-CI_DII_1137+ #1 [ 57.410644] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2222.A01.1805300339 05/30/2018 [ 57.410650] Call Trace: [ 57.410652] <IRQ> [ 57.410657] dump_stack+0x67/0x9b [ 57.410662] print_circular_bug.isra.16+0x1c8/0x2b0 [ 57.410666] __lock_acquire+0x1897/0x1b50 [ 57.410671] ? lock_acquire+0xa6/0x210 [ 57.410674] lock_acquire+0xa6/0x210 [ 57.410706] ? execlists_submit_request+0x2b/0x1a0 [i915] [ 57.410711] _raw_spin_lock_irqsave+0x33/0x50 [ 57.410741] ? execlists_submit_request+0x2b/0x1a0 [i915] [ 57.410769] execlists_submit_request+0x2b/0x1a0 [i915] [ 57.410774] ? _raw_spin_unlock_irqrestore+0x39/0x60 [ 57.410804] submit_notify+0x8d/0x124 [i915] [ 57.410828] __i915_sw_fence_complete+0x81/0x250 [i915] [ 57.410854] dma_i915_sw_fence_wake+0xd/0x20 [i915] [ 57.410858] dma_fence_signal_locked+0x79/0x200 [ 57.410882] notify_ring+0x2ba/0x480 [i915] [ 57.410907] gen8_cs_irq_handler+0x39/0xa0 [i915] [ 57.410933] gen11_irq_handler+0x2f0/0x420 [i915] [ 57.410938] __handle_irq_event_percpu+0x42/0x370 [ 57.410943] handle_irq_event_percpu+0x2b/0x70 [ 57.410947] handle_irq_event+0x2f/0x50 [ 57.410951] handle_edge_irq+0xe7/0x190 [ 57.410955] handle_irq+0x67/0x160 [ 57.410958] do_IRQ+0x5e/0x120 [ 57.410962] common_interrupt+0xf/0xf [ 57.410965] </IRQ> [ 57.410969] RIP: 0010:cpuidle_enter_state+0xac/0x360 [ 57.410972] Code: 44 00 00 31 ff e8 84 93 91 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 31 02 00 00 31 ff e8 7d 30 98 ff e8 e8 0e 94 ff fb 4c 29 fb <48> ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7 ea b8 ff [ 57.411015] RSP: 0018:ffffc90000133e90 EFLAGS: 00000216 ORIG_RAX: ffffffffffffffdd [ 57.411023] RAX: ffff8804ae748040 RBX: 000000000002a97d RCX: 0000000000000000 [ 57.411029] RDX: 0000000000000046 RSI: ffffffff82141263 RDI: ffffffff820f05a7 [ 57.411035] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 [ 57.411041] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8229f078 [ 57.411045] R13: ffff8804ab2adfa8 R14: 0000000000000000 R15: 0000000d5de092e3 [ 57.411052] do_idle+0x1f3/0x250 [ 57.411055] cpu_startup_entry+0x6a/0x70 [ 57.411059] start_secondary+0x19d/0x1f0 [ 57.411064] secondary_startup_64+0xa5/0xb0 The easiest remedy is to remove the defunct code. Fixes: ff047a87cfac ("drm/i915/icl: Correctly clear lost ctx-switch interrupts across reset for Gen11") References: fd8526e50902 ("drm/i915/execlists: Trust the CSB") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-3-chris@chris-wilson.co.uk
2018-07-13	drm/i915: Do not short-circuit tasklets during reset	Chris Wilson
	Inside intel_engine_is_idle(), we flush the tasklet to ensure that is being run in a timely fashion (ksoftirqd has taught us to expect the worst). However, if we are in the middle of reset, the HW may not yet be ready to execute the submission tasklet and so we must respect the disable flag. Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets") Testcase: igt/drv_selftest/live_hangcheck Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-2-chris@chris-wilson.co.uk
2018-07-13	drm/i915/selftests: Include the start of each subtest in the GEM trace	Chris Wilson
	Knowing the boundary of each subtest can be instrumental in digesting the voluminous trace output and finding the critical piece of information. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-1-chris@chris-wilson.co.uk
2018-07-13	drm/amdgpu/pp/smu7: cache smu firmware toc	Alex Deucher
	Rather than calculating it everytime we rebuild the toc buffer, calculate it once initially and then just copy the cached results to the vram buffer. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amdgpu/pp/smu7: remove local mc_addr variable	Alex Deucher
	use the structure member directly. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amdgpu/pp/smu7: drop unused values in smu data structure	Alex Deucher
	use kaddr directly rather than secondary variable. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amdgpu/pp/smu7: use a local variable for toc indexing	Alex Deucher
	Rather than using the index variable stored in vram. If the device fails to come back online after a resume cycle, reads from vram will return all 1s which will cause a segfault. Based on a patch from Thomas Martitz <kugel@rockbox.org>. This avoids the segfault, but we still need to sort out why the GPU does not come back online after a resume. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amdgpu/vi: fix mixed up state in smu clockgating setup	Alex Deucher
	Use the PP_STATE_SUPPORT_* rather than AMD_CG_SUPPORT_* when communicating with the SMU. Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: properly turn autocal off	Dmytro Laktyushkin
	[why] Currently we do not turn off autocal when scaling is in bypass. In case vbios enalbes auto scale and our first mode set is a non-scaled mode we have autocal on causing screen corruption. [how] moves turning autocal off to be first thing done during scaler setup Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: Initialize data structure for DalMpVisualConfirm.	Hugo Hu
	[Why] Prevent unexpected color shows if DalMpVisualConfirm enable. [How] Zero out color configuration data for DalMpVisualConfirm when initiating. Signed-off-by: Hugo Hu <hugo.hu@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: dal 3.1.55	Tony Cheng
	Signed-off-by: Tony Cheng <tony.cheng@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: update dml to match DV dml	Dmytro Laktyushkin
	DV updated their dml with an option to use max vstartup, this updates dc dml with the same option Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: add max scl ratio to soc bounding box	Dmytro Laktyushkin
	Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: Fix new stream count check in dc_add_stream_to_ctx	Ken Chalmers
	[Why] The previous code could allow through attempts to enable more streams than there are timing generators, in designs where the number of pipes is greater than the number of timing generators. [How] Compare the new stream count to the resource pool's timing generator count, instead of its pipe count. Also correct a typo in the error message. Signed-off-by: Ken Chalmers <ken.chalmers@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: dp debugfs allow link rate lane count greater than dp rx ↵	Hersen Wu
	reported caps [Why] when hw team does phy parameters tuning, there is need to force dp link rate or lane count grater than the values from dp receiver to check dp tx. current debufs limit link rate, lane count no more than rx caps. [How] remove force settings less than rx caps check v2: Fix typo in title Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: Expose couple OPTC functions through header	David Francis
	Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: Add CRC support for DCN	David Francis
	[Why] Regamma/CTM tests require CRC support [How] The CRC registers that were used in DCE exist under different names in DCN. The code was copied from DCE (in dc/dce110/dce110_timing_generator.c) into DCN, and changed to use the DCN register access helper functions. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13	drm/amd/display: Return out_link_loss from interrupt handler	Fatemeh Darbehani
	Signed-off-by: Fatemeh Darbehani <fatemeh.darbehani@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>