Age | Commit message (Collapse) | Author |
|
The basic problem here is that it's not allowed to page fault while
holding the reservation lock.
So it can happen that multiple processes try to validate an userptr
at the same time.
Work around that by putting the HMM range object into the mutex
protected bo list for now.
v2: make sure range is set to NULL in case of an error
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Since switching to HMM we always need that because we no longer grab
references to the pages.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Now that we've fixed the issue with using the incorrect topology manager,
we're actually grabbing the topology manager's lock - and consequently
deadlocking. Luckily for us though, there's actually nothing in AMD's DSC
state computation code that really should need this lock. The one exception
is the mutex_lock() in dm_dp_mst_is_port_support_mode(), however we grab no
locks beneath &mgr->lock there so that should be fine to leave be.
Gitlab issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This bug hurt me. Basically, it appears that we've been grabbing the
entirely wrong mutex in the MST DSC computation code for amdgpu! While
we've been grabbing:
amdgpu_dm_connector->mst_mgr
That's zero-initialized memory, because the only connectors we'll ever
actually be doing DSC computations for are MST ports. Which have mst_mgr
zero-initialized, and instead have the correct topology mgr pointer located
at:
amdgpu_dm_connector->mst_port->mgr;
I'm a bit impressed that until now, this code has managed not to crash
anyone's systems! It does seem to cause a warning in LOCKDEP though:
[ 66.637670] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
This was causing the problems that appeared to have been introduced by:
commit 4d07b0bc4034 ("drm/display/dp_mst: Move all payload info into the atomic state")
This wasn't actually where they came from though. Presumably, before the
only thing we were doing with the topology mgr pointer was attempting to
grab mst_mgr->lock. Since the above commit however, we grab much more
information from mst_mgr including the atomic MST state and respective
modesetting locks.
This patch also implies that up until now, it's quite likely we could be
susceptible to race conditions when going through the MST topology state
for DSC computations since we technically will not have grabbed any lock
when going through it.
So, let's fix this by adjusting all the respective code paths to look at
the right pointer and skip things that aren't actual MST connectors from a
topology.
Gitlab issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Looks like that we're accidentally dropping a pretty important return code
here. For some reason, we just return -EINVAL if we fail to get the MST
topology state. This is wrong: error codes are important and should never
be squashed without being handled, which here seems to have the potential
to cause a deadlock.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Fixes: 8ec046716ca8 ("drm/dp_mst: Add helper to trigger modeset on affected DSC MST CRTCs")
Cc: <stable@vger.kernel.org> # v5.6+
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It appears that amdgpu makes the mistake of completely ignoring the return
values from the DP MST helpers, and instead just returns a simple
true/false. In this case, it seems to have come back to bite us because as
a result of simply returning false from
compute_mst_dsc_configs_for_state(), amdgpu had no way of telling when a
deadlock happened from these helpers. This could definitely result in some
kernel splats.
V2:
* Address Wayne's comments (fix another bunch of spots where we weren't
passing down return codes)
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Assert on non-OK response from SMU is unnecessary.
It was replaced with respective log message on other asics
in the past with commit:
"drm/amd/display: Removing assert statements for Linux"
[How]
Remove assert and add dbg logging as on other DCNs.
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Saaem Rizvi <SyedSaaem.Rizvi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Move the new callback outside of the guard.
Fixes: dc55b106ad47 ("drm/amd/display: Disable phantom OTG after enable for plane disable")
CC: Alvin Lee <Alvin.Lee2@amd.com>
CC: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is no reason to use page based mappings, as the established
mappings are special driver mappings anyways and should not be
handled like normal pages.
Be consistent with what other drivers do and use raw PFN based
mappings.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
The GPU is found on the NXP i.MX8MN SoC. The feature bits are taken from
the NXP downstream kernel driver 6.4.3.p2.
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
|
|
For the GSC engine we only want to capture the instance regs.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221111001832.4144910-1-daniele.ceraolospurio@intel.com
|
|
Previously, we only used PXP FW interface version-42 structures for
PXP arbitration session on ADL/TGL products and version-43 for HuC
authentication on DG2. That worked fine despite not differentiating such
versioning of the PXP firmware interaction structures. This was okay
back then because the only commands used via version 42 was not
used via version 43 and vice versa.
With MTL, we'll need both these versions side by side for the same
commands (PXP-session) with the older platform feature support. That
said, let's create separate files to define the structures and definitions
for both version-42 and 43 of PXP FW interfaces.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108045628.4187260-2-alan.previn.teres.alexis@intel.com
|
|
The MODULE_LICENSE macro is missing from the kunit helpers file, thus
leading to a build error.
Let's introduce it along with MODULE_AUTHOR.
Fixes: 44a3928324e9 ("drm/tests: Add Kunit Helpers")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Maíra Canal <mairacanal@riseup.net>
Link: https://lore.kernel.org/r/20221116091712.1309651-2-maxime@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
|
The kunit helpers code weren't including its header, leading to a
warning that no previous prototype had been defined for public
functions.
Include the matching header to fix the warning.
Fixes: 44a3928324e9 ("drm/tests: Add Kunit Helpers")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Maíra Canal <mairacanal@riseup.net>
Link: https://lore.kernel.org/r/20221116091712.1309651-1-maxime@cerno.tech
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
|
We've used a temporary platform device for firmware EDID loading since
it was introduced in commit da0df92b5731 ("drm: allow loading an EDID as
firmware to override broken monitor"), but there's no explanation why.
Using a temporary device does not play well with CONFIG_FW_CACHE=y,
which caches firmware images (e.g. on suspend) so that drivers can
request firmware when the system is not ready for it, and return the
images from the cache (e.g. during resume). This works automatically for
regular devices, but obviously not for a temporarily created device.
Stop using the throwaway platform device, and use the drm device
instead.
Note that this may still be problematic for cases where the display was
plugged in during suspend, and the firmware wasn't loaded and therefore
not cached before suspend.
References: https://lore.kernel.org/r/20220727074152.43059-1-matthieu.charette@gmail.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2061
Reported-by: Matthieu CHARETTE <matthieu.charette@gmail.com>
Tested-by: Matthieu CHARETTE <matthieu.charette@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20221114111709.434979-1-jani.nikula@intel.com
|
|
Support the kernel's nomodeset parameter for all PCI-based fbdev
drivers that use aperture helpers to remove other, hardware-agnostic
graphics drivers.
The parameter is a simple way of using the firmware-provided scanout
buffer if the hardware's native driver is broken. The same effect
could be achieved with per-driver options, but the importance of the
graphics output for many users makes a single, unified approach
worthwhile.
With nomodeset specified, the fbdev driver module will not load. This
unifies behavior with similar DRM drivers. In DRM helpers, modules
first check the nomodeset parameter before registering the PCI
driver. As fbdev has no such module helpers, we have to modify each
driver individually.
The name 'nomodeset' is slightly misleading, but has been chosen for
historical reasons. Several drivers implemented it before it became a
general option for DRM. So keeping the existing name was preferred over
introducing a new one.
v2:
* print a warning if a driver does not init (Helge)
* wrap video_firmware_drivers_only() in helper
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221111133024.9897-3-tzimmermann@suse.de
|
|
Move the nomodeset kernel parameter to drivers/video to make it
available to non-DRM drivers. Adapt the interface, but keep the DRM
interface drm_firmware_drivers_only() to avoid churn within DRM. The
function should later be inlined into callers.
The parameter disables any DRM graphics driver that would replace a
driver for firmware-provided scanout buffers. It is an option to easily
fallback to basic graphics output if the hardware's native driver is
broken. Moving it to a more prominent location wil make it available
to fbdev as well.
v2:
* clarify the meaning of the nomodeset parameter (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221111133024.9897-2-tzimmermann@suse.de
|
|
The fbdev damage worker is unused, so remove it.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115115819.23088-7-tzimmermann@suse.de
|
|
Schedule the deferred-I/O worker instead of the damage worker after
writing to the fbdev framebuffer. The deferred-I/O worker then performs
the dirty-fb update. The fbdev emulation will initialize deferred I/O
for all drivers that require damage updates. It is therefore a valid
assumption that the deferred-I/O worker is present.
It would be possible to perform the damage handling directly from within
the write operation. But doing this could increase the overhead of the
write or interfere with a concurrently scheduled deferred-I/O worker.
Instead, scheduling the deferred-I/O worker with its regular delay of
50 ms removes load off the write operation and allows the deferred-I/O
worker to handle multiple write operations that arrived during the delay
time window.
v3:
* remove unused variable (lkp)
v2:
* keep drm_fb_helper_damage() (Daniel)
* use fb_deferred_io_schedule_flush() (Daniel)
* clarify comments (Daniel)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115115819.23088-6-tzimmermann@suse.de
|
|
Call fb_dirty directly from drm_fb_helper_deferred_io() to avoid the
latency of running the damage worker.
The deferred-I/O helper drm_fb_helper_deferred_io() runs in a worker
thread at regular intervals as part of writing to mmaped framebuffer
memory. It used to schedule the fbdev damage worker to flush the
framebuffer. Changing this to flushing the framebuffer directly avoids
the latency introduced by the damage worker.
v2:
* remove fb_dirty from defio in separate patch (Daniel)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115115819.23088-5-tzimmermann@suse.de
|
|
The helper for processing deferred I/O on pages has no dependency on
the fb_dirty damge-handling callback; so remove the test. In practice,
deferred I/O is only used with damage handling and the damage worker
already guarantees the presence of the fb_dirty callback.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115115819.23088-4-tzimmermann@suse.de
|
|
Move the dirty-fb update from the damage-worker callback into the
new helper drm_fb_helper_fb_dirty(), so that it can run outside the
damage worker. This change will help to remove the damage worker
entirely. No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115115819.23088-3-tzimmermann@suse.de
|
|
Set the damage area in the new helper drm_fb_helper_add_damage_clip().
It can now be updated without scheduling the damage worker. This change
will help to remove the damage worker entirely. No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115115819.23088-2-tzimmermann@suse.de
|
|
These formats are not subsampled, but that means hsub and vsub should be
1, not 0.
Fixes: 94b292b27734 ("drm: drm_fourcc: add NV15, Q410, Q401 YUV formats")
Reported-by: George Kennedy <george.kennedy@oracle.com>
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Brian Starkey <brian.starkey@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913144306.17279-1-brian.starkey@arm.com
|
|
drm_mode_config_init() simply calls drmm_mode_config_init(), hence
cleanup is automatically handled through registering
drm_mode_config_cleanup() with drmm_add_action_or_reset().
While at it, get rid of the deprecated drm_mode_config_init() and
replace it with drmm_mode_config_init() directly.
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026155934.125294-6-dakr@redhat.com
|
|
Use drm managed resource allocation (drmm_universal_plane_alloc()) in
order to get rid of the explicit destroy hook in struct drm_plane_funcs.
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026155934.125294-5-dakr@redhat.com
|
|
Use drmm_crtc_init_with_planes() instead of drm_crtc_init_with_planes()
to get rid of the explicit destroy hook in struct drm_plane_funcs.
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026155934.125294-4-dakr@redhat.com
|
|
Using drm_device->dev_private is deprecated. Since we've switched to
devm_drm_dev_alloc(), struct drm_device is now embedded in struct
malidp_drm, hence we can use container_of() to get the struct drm_device
instance instead.
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026155934.125294-3-dakr@redhat.com
|
|
Use drm managed resources to allocate driver structures and get rid of
the deprecated drm_dev_alloc() call and replace it with
devm_drm_dev_alloc().
This also serves as preparation to get rid of drm_device->dev_private
and to fix use-after-free issues on driver unload.
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221026155934.125294-2-dakr@redhat.com
|
|
In i915_gem_madvise_ioctl() we immediately purge the object is not
currently used, like when the mm.pages are NULL. With shmem the pages
might still be hanging around or are perhaps swapped out. Similarly with
ttm we might still have the pages hanging around on the ttm resource,
like with lmem or shmem, but here we need to be extra careful since
async unbinds are possible as well as in-progress kernel moves. In
i915_ttm_purge() we expect the pipeline-gutting to nuke the ttm resource
for us, however if it's busy the memory is only moved to a ghost object,
which then leads to broken behaviour when for example clearing the
i915_tt->filp, since the actual ttm_tt is still alive and populated,
even though it's been moved to the ghost object. When we later destroy
the ghost object we hit the following, since the filp is now NULL:
[ +0.006982] #PF: supervisor read access in kernel mode
[ +0.005149] #PF: error_code(0x0000) - not-present page
[ +0.005147] PGD 11631d067 P4D 11631d067 PUD 115972067 PMD 0
[ +0.005676] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.012962] Workqueue: events ttm_device_delayed_workqueue [ttm]
[ +0.006022] RIP: 0010:i915_ttm_tt_unpopulate+0x3a/0x70 [i915]
[ +0.005879] Code: 89 fb 48 85 f6 74 11 8b 55 4c 48 8b 7d 30 45 31 c0 31 c9 e8 18 6a e5 e0 80 7d 60 00 74 20 48 8b 45 68
8b 55 08 4c 89 e7 5b 5d <48> 8b 40 20 83 e2 01 41 5c 89 d1 48 8b 70
30 e9 42 b2 ff ff 4c 89
[ +0.018782] RSP: 0000:ffffc9000bf6fd70 EFLAGS: 00010202
[ +0.005244] RAX: 0000000000000000 RBX: ffff8883e12ae380 RCX: 0000000000000000
[ +0.007150] RDX: 000000008000000e RSI: ffffffff823559b4 RDI: ffff8883e12ae3c0
[ +0.007142] RBP: ffff888103b65d48 R08: 0000000000000001 R09: 0000000000000001
[ +0.007144] R10: 0000000000000001 R11: ffff88829c2c8040 R12: ffff8883e12ae3c0
[ +0.007148] R13: 0000000000000001 R14: ffff888115184140 R15: ffff888115184248
[ +0.007154] FS: 0000000000000000(0000) GS:ffff88844db00000(0000) knlGS:0000000000000000
[ +0.008108] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.005763] CR2: 0000000000000020 CR3: 000000013fdb4004 CR4: 00000000003706e0
[ +0.007152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ +0.007145] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ +0.007154] Call Trace:
[ +0.002459] <TASK>
[ +0.002126] ttm_tt_unpopulate.part.0+0x17/0x70 [ttm]
[ +0.005068] ttm_bo_tt_destroy+0x1c/0x50 [ttm]
[ +0.004464] ttm_bo_cleanup_memtype_use+0x25/0x40 [ttm]
[ +0.005244] ttm_bo_cleanup_refs+0x90/0x2c0 [ttm]
[ +0.004721] ttm_bo_delayed_delete+0x235/0x250 [ttm]
[ +0.004981] ttm_device_delayed_workqueue+0x13/0x40 [ttm]
[ +0.005422] process_one_work+0x248/0x560
[ +0.004028] worker_thread+0x4b/0x390
[ +0.003682] ? process_one_work+0x560/0x560
[ +0.004199] kthread+0xeb/0x120
[ +0.003163] ? kthread_complete_and_exit+0x20/0x20
[ +0.004815] ret_from_fork+0x1f/0x30
v2:
- Just use ttm_bo_wait() directly (Niranjana)
- Add testcase reference
Testcase: igt@gem_madvise@dontneed-evict-race
Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Nirmoy Das <Nirmoy.Das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115104620.120432-1-matthew.auld@intel.com
|
|
vm_fault_ttm() should not expect ttm ghost obj so remove that check.
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024144558.27747-1-nirmoy.das@intel.com
|
|
All calls to i915_vma_move_to_active are surrounded by vma lock
and/or there are multiple local helpers for it in particular tests.
Let's replace it by common helper.
The patch should not introduce functional changes.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-3-andrzej.hajda@intel.com
|
|
Since almost all calls to i915_vma_move_to_active are prepended with
i915_request_await_object, let's call the latter from
_i915_vma_move_to_active by default and add flag allowing bypassing it.
Adjust all callers accordingly.
The patch should not introduce functional changes.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com
|
|
There were several updates in the driver on how the workarounds are
handled since its documentation was written. Update the documentation to
reflect the current reality.
v2:
- Remove footnote that was wrongly referenced, adding back the
reference in the correct paragraph.
- Remove "Display workarounds" and just mention "display IP" under
"Other" category since all of them are peppered around the driver.
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> # v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115192611.179981-1-lucas.demarchi@intel.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 6.2:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- atomic-helper: Add begin_fb_access and end_fb_access hooks
- fb-helper: Rework to move fb emulation into helpers
- scheduler: rework entity flush, kill and fini
- ttm: Optimize pool allocations
Driver Changes:
- amdgpu: scheduler rework
- hdlcd: Switch to DRM-managed resources
- ingenic: Fix registration error path
- lcdif: FIFO threshold tuning
- meson: Fix return type of cvbs' mode_valid
- ofdrm: multiple fixes (kconfig, types, endianness)
- sun4i: A100 and D1 support
- panel:
- New Panel: Jadard JD9365DA-H3
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221110083612.g63eaocoaa554soh@houat
|
|
The state of VRAM is unreliable due to a PCI event like AER, link reset
or DPC.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-submitting IBs by the kernel has many problems because pre-
requisite state is not automatically re-created as well. In
other words neither binary semaphores nor things like ring
buffer pointers are in the state they should be when the
hardware starts to work on the IBs again.
Additional to that even after more than 5 years of
developing this feature it is still not stable and we have
massively problems getting the reference counts right.
As discussed with user space developers this behavior is not
helpful in the first place. For graphics and multimedia
workloads it makes much more sense to either completely
re-create the context or at least re-submitting the IBs
from userspace.
For compute use cases re-submitting is also not very
helpful since userspace must rely on the accuracy of
the result.
Because of this we stop this practice and instead just
properly note that the fence submission was canceled. The
only use case we keep the re-submission for now is SRIOV
and function level resets.
v2: as suggested by Sshaoyun stop resubmitting jobs even for SRIOV
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit e6c6338f393b74ac0b303d567bb918b44ae7ad75.
This feature basically re-submits one job after another to
figure out which one was the one causing a hang.
This is obviously incompatible with gang-submit which requires
that multiple jobs run at the same time. It's also absolutely
not helpful to crash the hardware multiple times if a clean
recovery is desired.
For testing and debugging environments we should rather disable
recovery alltogether to be able to inspect the state with a hw
debugger.
Additional to that the sw implementation is clearly buggy and causes
reference count issues for the hardware fence.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In the PP_OD_EDIT_VDDC_CURVE case the "input_index" variable is capped at
2 but not checked for negative values so it results in an out of bounds
read. This value comes from the user via sysfs.
Fixes: d5bf26539494 ("drm/amd/powerplay: added vega20 overdrive support V3")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
One-element arrays are deprecated, and we are replacing them with
flexible array members instead. So, replace one-element array with
flexible-array member in structs ATOM_I2C_VOLTAGE_OBJECT_V3,
ATOM_ASIC_INTERNAL_SS_INFO_V2, ATOM_ASIC_INTERNAL_SS_INFO_V3,
and refactor the rest of the code accordingly.
Important to mention is that doing a build before/after this patch
results in no functional binary output differences.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/238
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [1]
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The workaround designed for some specific ASICs is wrongly applied
to SMU13 ASICs. That leads to some runpm hang.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable SMU13.0.7 runpm support.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable SMU13.0.0 runpm support.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
recovery is completed
Amdgpu_ras_set_error_query_ready is called at the start of
amdgpu_device_gpu_recover to disable query ras error, but the
code behind only enables query ras error in full reset path,
but not in soft reset path, emergency restart path and skip
the hardware reset path.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If we enable virtual display functionality on parts with
no display hardware we can end up trying to check for and
reserve the vbios FB area on devices where it doesn't exist.
Check if display hardware is actually present on the hardware
before trying to reserve the memory.
v2: move the check into common code
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It is to resolve a regression, which fails to allocate
VRAM due to no free memory in application, the reason
is we add check of vram_pin_size for memory limit, and
application is pinning the memory for Peerdirect, KFD
should not count it in memory limit. So removing
vram_pin_size will resolve it.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Otherwise, some unexpected PCIE AER errors will be observed
in runtime suspend/resume cycle.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add umc channel index mapping table for umc_v8_10.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
change the vangogh family to use IP discovery path to initialize IP
list, this needs to remove the DID from the PCI ID list to allow the IP
discovery path to set all the IP versions correctly.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use ip versions (10,3,1) to match the GPU after Vangogh switched to use IP
discovery path.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|