Age | Commit message (Collapse) | Author |
|
I just landed the fence deadline PR from Rob that a bunch of drivers
want/need to apply driver-specific patches. Backmerge -rc4 so that
they don't have to be stuck on -rc2 for no reason at all.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:
- Fix issue #6333: "list_add corruption" and full system lockup from
performance monitoring (Janusz)
- Give the punit time to settle before fatally failing (Aravind, Chris)
- Don't use stolen memory or BAR for ring buffers on LLC platforms (John)
- Add missing ecodes and correct timeline seqno on GuC error captures (John)
- Make sure DSM size has correct 1MiB granularity on Gen12+ (Nirmoy,
Lucas)
- Fix potential SSEU max_subslices array-index-out-of-bounds access on Gen11 (Andrea)
- Whitelist COMMON_SLICE_CHICKEN3 for UMD access on Gen12+ (Matt R.)
- Apply Wa_1408615072/Wa_1407596294 correctly on Gen11 (Matt R)
- Apply LNCF/LBCF workarounds correctly on XeHP SDV/PVC/DG2 (Matt R)
- Implement Wa_1606376872 for Xe_LP (Gustavo)
- Consider GSI offset when doing MCR lookups on Meteorlake+ (Matt R.)
- Add engine TLB invalidation for Meteorlake (Matt R.)
- Fix GSC Driver-FLR completion on Meteorlake (Alan)
- Fix GSC races on driver load/unload on Meteorlake+ (Daniele)
- Disable MC6 for MTL A step (Badal)
- Consolidate TLB invalidation flow (Tvrtko)
- Improve debug GuC/HuC debug messages (Michal Wa., John)
- Move fd_install after last use of fence (Rob)
- Initialize the obj flags for shmem objects (Aravind)
- Fix missing debug object activation (Nirmoy)
- Probe lmem before the stolen portion (Matt A)
- Improve clean up of GuC busyness stats worker (John)
- Fix missing return code checks in GuC submission init (John)
- Annotate two more workaround/tuning registers as MCR on PVC (Matt R)
- Fix GEN8_MISCCPCTL definition and remove unused INF_UNIT_LEVEL_CLKGATE (Lucas)
- Use sysfs_emit() and sysfs_emit_at() (Nirmoy)
- Make kobj_type structures constant (Thomas W.)
- make kobj attributes const on gt/ (Jani)
- Remove the unused virtualized start hack on buddy allocator (Matt A)
- Remove redundant check for DG1 (Lucas)
- Move DG2 tuning to the right function (Lucas)
- Rename dev_priv to i915 for private data naming consistency in gt/ (Andi)
- Remove unnecessary whitelisting of CS_CTX_TIMESTAMP on Xe_HP platforms (Matt R.)
-
- Escape wildcard in method names in kerneldoc (Bagas)
- Selftest improvements (Chris, Jonathan, Tvrtko, Anshuman, Tejas)
- Fix sparse warnings (Jani)
[airlied: fix unused variable in intel_workarounds]
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZBMSb42yjjzczRhj@jlahtine-mobl.ger.corp.intel.com
|
|
Probe pseudo errors should be injected only in places where real errors
can be encountered, otherwise unwinding code can be broken.
Placing intel_uc_init_late before i915_inject_probe_error violated
this rule, resulting in following bug:
__intel_gt_disable:655 GEM_BUG_ON(intel_gt_pm_is_awake(gt))
Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230314151920.1065847-1-andrzej.hajda@intel.com
(cherry picked from commit c4252a11131c7f27a158294241466e2a4e7ff94e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
If we unload the driver and wedge before the GSC worker is complete,
the worker will hit an error on its submission to the GSC engine and
then exit. This is hard to hit for a user, but it is reproducible
with skipping selftests. The error is handled gracefully by the
worker, so there are no functional issues, but we still end up with
an error message in dmesg, which is something we want to avoid as
this is a supported scenario. We could modify the worker to better
handle a wedging occurring during its execution, but that gets
complicated for a couple of reasons:
- We do want the error on runtime wedging, because there are
implications for subsystems outside of GT (i.e., PXP, HDCP), it's
only the error on driver unload that we want to silence.
- The worker is responsible for multiple submissions (GSC FW load,
HuC auth, SW proxy), so all of those will have to be adapted to
handle the wedged_on_fini scenario.
Therefore, it's much simpler to just wait for the worker to be done
before wedging on driver removal, also considering that the worker
will likely already be idle in the great majority of non-selftest
scenarios.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230223172120.3304293-2-daniele.ceraolospurio@intel.com
|
|
As intel_pm.[ch] used to contain much more, intel_pm.h was included in a
lot of places. Many of them are now unnecessary. Remove.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ab9a7147b0cd63d95b9f27ed40615b9c9be18f84.1677678803.git.jani.nikula@intel.com
|
|
As the logic for selecting the register and corresponsing values grew, the
code become a bit unsightly. Consolidate by storing the required values at
engine init time in the engine itself, and by doing so minimise the amount
of invariant platform and engine checks during each and every TLB
invalidation.
v2:
* Fail engine probe if TLB invlidations registers are unknown.
v3:
* Rebase.
v4:
* Fix handling of GEN8_M2TCR. (Andrzej)
v5:
* Tidy checkpatch warnings.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> # v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216092123.159085-1-tvrtko.ursulin@linux.intel.com
|
|
Check that we invalidate the TLB cache, the updated physical addresses
are immediately visible to the HW, and there is no retention of the old
physical address for concurrent HW access.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[ahajda: adjust to upstream driver, v2+]
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[tursulin: Small indentation fix.]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230130165058.1647414-1-andrzej.hajda@intel.com
|
|
When trying to analyse bug reports from CI, customers, etc. it can be
difficult to work out exactly what is happening on which GT in a
multi-GT system. So add GT oriented debug/error message wrappers. If
used instead of the drm_ equivalents, you get the same output but with
a GT# prefix on it.
v2: Go back to using lower case names (combined review feedback).
Convert intel_gt.c as a first step.
v3: Add gt_err_ratelimited() as well, undo one conversation that might
not have a GT pointer in some scenarios (review feedback from Michal W).
Split definitions into separate header (review feedback from Jani).
Convert all intel_gt*.c files.
v4: Re-order some macro definitions (Andi S), update (c) date (Tvrtko)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230111200429.2139084-2-John.C.Harrison@Intel.com
|
|
Revert to the original explicit approach and document the reasoning
behind it.
v2:
* DG2 needs to be covered too. (Matt)
v3:
* Full version check for Gen12 to avoid catching all future platforms.
(Matt)
v4:
* Be totally explicit on the Gen12 branch. (Andrzej)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> # v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230110113533.744436-1-tvrtko.ursulin@linux.intel.com
|
|
Sync after v6.2-rc1 landed in drm-next.
We need to get some dependencies in place before we can merge
the fixes series from Gwan-gyeong and Chris.
References: https://lore.kernel.org/all/Y6x5JCDnh2rvh4lA@intel.com/
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
In case of Gen12.50 video and compute engines, TLB_INV registers are
masked - to modify one bit, corresponding bit in upper half of the register
must be enabled, otherwise nothing happens.
Fixes: 77fa9efc16a9 ("drm/i915/xehp: Create separate reg definitions for new MCR registers")
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221214075439.402485-1-andrzej.hajda@intel.com
|
|
Pull drm updates from Dave Airlie:
"The biggest highlight is that the accel subsystem framework is merged.
Hopefully for 6.3 we will be able to line up a driver to use it.
In drivers land, i915 enables DG2 support by default now, and nouveau
has a big stability refactoring and initial ampere support, AMD
includes new hw IP support and should build on ARM again. There is
also an ofdrm driver to take over offb on platforms it's used.
Stuff outside my tree, the dma-buf patches hit a few places, the vc4
firmware changes also do, and i915 has some interactions with MEI for
discrete GPUs. I think all of those should have been acked/reviewed by
relevant parties.
New driver:
- ofdrm - replacement for offb
fbdev:
- add support for nomodeset
fourcc:
- add Vivante tiled modifier
core:
- atomic-helpers: CRTC primary plane test fixes, fb access hooks
- connector: TV API consistency, cmdline parser improvements
- send connector hotplug on cleanup
- sort makefile objects
tests:
- sort kunit tests
- improve DP-MST tests
- add kunit helpers to create a device
sched:
- module param for scheduling policy
- refcounting fix
buddy:
- add back random seed log
ttm:
- convert ttm_resource to size_t
- optimize pool allocations
edid:
- HFVSDB parsing support fixes
- logging/debug improvements
- DSC quirks
dma-buf:
- Add unlocked vmap and attachment mapping
- move drivers to common locking convention
- locking improvements
firmware:
- new API for rPI firmware and vc4
xilinx:
- zynqmp: displayport bridge support
- dpsub fix
bridge:
- adv7533: Remove dynamic lane switching
- it6505: Runtime PM support, sync improvements
- ps8640: Handle AUX defer messages
- tc358775: Drop soft-reset over I2C
panel:
- panel-edp: Add INX N116BGE-EA2 C2 and C4 support.
- Jadard JD9365DA-H3
- NewVision NV3051D
amdgpu:
- DCN support on ARM
- DCN 2.1 secure display
- Sienna Cichlid mode2 reset fixes
- new GC 11.x firmware versions
- drop AMD specific DSC workarounds in favour of drm code
- clang warning fixes
- scheduler rework
- SR-IOV fixes
- GPUVM locking fixes
- fix memory leak in CS IOCTL error path
- flexible array updates
- enable new GC/PSP/SMU/NBIO IP
- GFX preemption support for gfx9
amdkfd:
- cache size fixes
- userptr fixes
- enable cooperative launch on gfx 10.3
- enable GC 11.0.4 KFD support
radeon:
- replace kmap with kmap_local_page
- ACPI ref count fix
- HDA audio notifier support
i915:
- DG2 enabled by default
- MTL enablement work
- hotplug refactoring
- VBT improvements
- Display and watermark refactoring
- ADL-P workaround
- temp disable runtime_pm for discrete-
- fix for A380 as a secondary GPU
- Wa_18017747507 for DG2
- CS timestamp support fixes for gen5 and earlier
- never purge busy TTM objects
- use i915_sg_dma_sizes for all backends
- demote GuC kernel contexts to normal priority
- gvt: refactor for new MDEV interface
- enable DC power states on eDP ports
- fix gen 2/3 workarounds
nouveau:
- fix page fault handling
- Ampere acceleration support
- driver stability improvements
- nva3 backlight support
msm:
- MSM_INFO_GET_FLAGS support
- DPU: XR30 and P010 image formats
- Qualcomm SM6115 support
- DSI PHY support for QCM2290
- HDMI: refactored dev init path
- remove exclusive-fence hack
- fix speed-bin detection
- enable clamp to idle on 7c3
- improved hangcheck detection
vmwgfx:
- fb and cursor refactoring
- convert to generic hashtable
- cursor improvements
etnaviv:
- hw workarounds
- softpin MMU fixes
ast:
- atomic gamma LUT support
- convert to SHMEM
lcdif:
- support YUV planes
- Increase DMA burst size
- FIFO threshold tuning
meson:
- fix return type of cvbs mode_valid
mgag200:
- fix PLL setup on some revisions
sun4i:
- A100 and D1 support
udl:
- modesetting improvements
- hot unplug support
vc4:
- support PAL-M
- fix regression preventing 4K @ 60Hz
- fix NULL ptr deref
v3d:
- switch to drm managed resources
renesas:
- RZ/G2L DSI support
- DU Kconfig cleanup
mediatek:
- fixup dpi and hdmi
- MT8188 dpi support
- MT8195 AFBC support
tegra:
- NVDEC hardware on Tegra234 SoC
hdlcd:
- switch to drm managed resources
ingenic:
- fix registration error path
hisilicon:
- convert to drm_mode_init
maildp:
- use managed resources
mtk:
- use drm_mode_init
rockchip:
- use drm_mode_copy"
* tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits)
drm/amdgpu: fix mmhub register base coding error
drm/amdgpu: add tmz support for GC IP v11.0.4
drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4
drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4
drm/amdgpu: enable GFX IP v11.0.4 CG support
drm/amdgpu: Make amdgpu_ring_mux functions as static
drm/amdgpu: generally allow over-commit during BO allocation
drm/amd/display: fix array index out of bound error in DCN32 DML
drm/amd/display: 3.2.215
drm/amd/display: set optimized required for comp buf changes
drm/amd/display: Add debug option to skip PSR CRTC disable
drm/amd/display: correct DML calc error of UrgentLatency
drm/amd/display: correct static_screen_event_mask
drm/amd/display: Ensure commit_streams returns the DC return code
drm/amd/display: read invalid ddc pin status cause engine busy
drm/amd/display: Bypass DET swath fill check for max clocks
drm/amd/display: Disable uclk pstate for subvp pipes
drm/amd/display: Fix DCN2.1 default DSC clocks
drm/amd/display: Enable dp_hdmi21_pcon support
drm/amd/display: prevent seamless boot on displays that don't have the preferred dig
...
|
|
Starting with MTL, there will be two GT-tiles, a render and media
tile. PXP as a service for supporting workloads with protected
contexts and protected buffers can be subscribed by process
workloads on any tile. However, depending on the platform,
only one of the tiles is used for control events pertaining to PXP
operation (such as creating the arbitration session and session
tear-down).
PXP as a global feature is accessible via batch buffer instructions
on any engine/tile and the coherency across tiles is handled implicitly
by the HW. In fact, for the foreseeable future, we are expecting this
single-control-tile for the PXP subsystem.
In MTL, it's the standalone media tile (not the root tile) because
it contains the VDBOX and KCR engine (among the assets PXP relies on
for those events).
Looking at the current code design, each tile is represented by the
intel_gt structure while the intel_pxp structure currently hangs off the
intel_gt structure.
Keeping the intel_pxp structure within the intel_gt structure makes some
internal functionalities more straight forward but adds code complexity to
code readability and maintainibility to many external-to-pxp subsystems
which may need to pick the correct intel_gt structure. An example of this
would be the intel_pxp_is_active or intel_pxp_is_enabled functionality
which should be viewed as a global level inquiry, not a per-gt inquiry.
That said, this series promotes the intel_pxp structure into the
drm_i915_private structure making it a top-level subsystem and the PXP
subsystem will select the control gt internally and keep a pointer to
it for internal reference.
This promotion comes with two noteworthy changes:
1. Exported pxp functions that are called by external subsystems
(such as intel_pxp_enabled/active) will have to check implicitly
if i915->pxp is valid as that structure will not be allocated
for HW that doesn't support PXP.
2. Since GT is now considered a soft-dependency of PXP we are
ensuring that GT init happens before PXP init and vice versa
for fini. This causes a minor ordering change whereby we previously
called intel_pxp_suspend after intel_uc_suspend but now is before
i915_gem_suspend_late but the change is required for correct
dependency flows. Additionally, this re-order change doesn't
have any impact because at that point in either case, the top level
entry to i915 won't observe any PXP events (since the GPU was
quiesced during suspend_prepare). Also, any PXP event doesn't
really matter when we disable the PXP HW (global GT irqs are
already off anyway, so even if there was a bug that generated
spurious events we wouldn't see it and we would just clean it
up on resume which is okay since the default fallback action
for PXP would be to keep the sessions off at this suspend stage).
Changes from prior revs:
v11: - Reformat a comment (Tvrtko).
v10: - Change the code flow for intel_pxp_init to make it more
cleaner and readible with better comments explaining the
difference between full-PXP-feature vs the partial-teelink
inits depending on the platform. Additionally, only do
the pxp allocation when we are certain the subsystem is
needed. (Tvrtko).
v9: - Cosmetic cleanups in supported/enabled/active. (Daniele).
- Add comments for intel_pxp_init and pxp_get_ctrl_gt that
explain the functional flow for when PXP is not supported
but the backend-assets are needed for HuC authentication
(Daniele and Tvrtko).
- Fix two remaining functions that are accessible outside
PXP that need to be checking pxp ptrs before using them:
intel_pxp_irq_handler and intel_pxp_huc_load_and_auth
(Tvrtko and Daniele).
- User helper macro in pxp-debugfs (Tvrtko).
v8: - Remove pxp_to_gt macro (Daniele).
- Fix a bug in pxp_get_ctrl_gt for the case of MTL and we don't
support GSC-FW on it. (Daniele).
- Leave i915->pxp as NULL if we dont support PXP and in line
with that, do additional validity check on i915->pxp for
intel_pxp_is_supported/enabled/active (Daniele).
- Remove unncessary include header from intel_gt_debugfs.c
and check drm_minor i915->drm.primary (Daniele).
- Other cosmetics / minor issues / more comments on suspend
flow order change (Daniele).
v7: - Drop i915_dev_to_pxp and in intel_pxp_init use 'i915->pxp'
through out instead of local variable newpxp. (Rodrigo)
- In the case intel_pxp_fini is called during driver unload but
after i915 loading failed without pxp being allocated, check
i915->pxp before referencing it. (Alan)
v6: - Remove HAS_PXP macro and replace it with intel_pxp_is_supported
because : [1] introduction of 'ctrl_gt' means we correct this
for MTL's upcoming series now. [2] Also, this has little impact
globally as its only used by PXP-internal callers at the moment.
- Change intel_pxp_init/fini to take in i915 as its input to avoid
ptr-to-ptr in init/fini calls.(Jani).
- Remove the backpointer from pxp->i915 since we can use
pxp->ctrl_gt->i915 if we need it. (Rodrigo).
v5: - Switch from series to single patch (Rodrigo).
- change function name from pxp_get_kcr_owner_gt to
pxp_get_ctrl_gt.
- Fix CI BAT failure by removing redundant call to intel_pxp_fini
from driver-remove.
- NOTE: remaining open still persists on using ptr-to-ptr
and back-ptr.
v4: - Instead of maintaining intel_pxp as an intel_gt structure member
and creating a number of convoluted helpers that takes in i915 as
input and redirects to the correct intel_gt or takes any intel_gt
and internally replaces with the correct intel_gt, promote it to
be a top-level i915 structure.
v3: - Rename gt level helper functions to "intel_pxp_is_enabled/
supported/ active_on_gt" (Daniele)
- Upgrade _gt_supports_pxp to replace what was intel_gtpxp_is
supported as the new intel_pxp_is_supported_on_gt to check for
PXP feature support vs the tee support for huc authentication.
Fix pxp-debugfs-registration to use only the former to decide
support. (Daniele)
- Couple minor optimizations.
v2: - Avoid introduction of new device info or gt variables and use
existing checks / macros to differentiate the correct GT->PXP
control ownership (Daniele Ceraolo Spurio)
- Don't reuse the updated global-checkers for per-GT callers (such
as other files within PXP) to avoid unnecessary GT-reparsing,
expose a replacement helper like the prior ones. (Daniele).
v1: - Add one more patch to the series for the intel_pxp suspend/resume
for similar refactoring
References: https://patchwork.freedesktop.org/patch/msgid/20221202011407.4068371-1-alan.previn.teres.alexis@intel.com
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221208180542.998148-1-alan.previn.teres.alexis@intel.com
|
|
Remove rmw_set(), rmw_clear(), clear_register(), rmw_set_fw(), and
rmw_clear_fw(). They're just one too many levels of abstraction for
register access, for very specific purposes.
clear_register() seems like a micro-optimization bypassing the write
when the register is already clear, but that trick has ceased to work
since commit 06b975d58fd6 ("drm/i915: make intel_uncore_rmw() write
unconditionally"). Just clear the register in the most obvious way.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221123164916.4128733-1-jani.nikula@intel.com
|
|
Pull drm fixes from Dave Airlie:
"Things do seem to have finally settled down, just four i915 and one
amdgpu this week. Probably won't have much for next week if you do
push rc8 out.
i915:
- Fix dram info readout
- Remove non-existent pipes from bigjoiner pipe mask
- Fix negative value passed as remaining time
- Never return 0 if not all requests retired
amdgpu:
- VCN fix for vangogh"
* tag 'drm-fixes-2022-12-02' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: enable Vangogh VCN indirect sram mode
drm/i915: Never return 0 if not all requests retired
drm/i915: Fix negative value passed as remaining time
drm/i915: Remove non-existent pipes from bigjoiner pipe mask
drm/i915/mtl: Fix dram info readout
|
|
We've been overloading uncore->lock to protect access to the MCR
steering register. That's not really what uncore->lock is intended for,
and it would be better if we didn't need to hold such a high-traffic
spinlock for the whole sequence of (apply steering, access MCR register,
restore steering). Let's create a dedicated MCR lock to protect the
steering control register over this critical section and stop relying on
the high-traffic uncore->lock.
For now the new lock is a software lock. However some platforms (MTL
and beyond) have a hardware-provided locking mechanism that can be used
to serialize not only software accesses, but also hardware/firmware
accesses as well; support for that hardware level lock will be added in
a future patch.
v2:
- Use irqsave/irqrestore spinlock calls; platforms using execlist
submission rather than GuC submission can perform MCR accesses in
interrupt context because reset -> errordump happens in a tasklet.
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221128233014.4000136-4-matthew.d.roper@intel.com
|
|
In case of Gen12 video and compute engines, TLB_INV registers are masked -
to modify one bit, corresponding bit in upper half of the register must
be enabled, otherwise nothing happens.
CVE: CVE-2022-4139
Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned. However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time. If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.
If request retirement succeeds but an error code is passed back via
remaininig_timeout, we may have no clue on how much of the initial timeout
might have been left for spending it on waiting for GuC to become idle.
OTOH, since all pending requests have been successfully retired, that
error code has been already ignored by intel_gt_retire_requests_timeout(),
then we shouldn't fail.
Assume no more time has been left on error and pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.
v3: Don't fail on any error passed back via remaining_timeout.
v2: Fix the issue on the caller side, not the provider.
Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f235dbd5b768e238d365fd05d92de5a32abc1c1f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned. However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time. If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.
If request retirement succeeds but an error code is passed back via
remaininig_timeout, we may have no clue on how much of the initial timeout
might have been left for spending it on waiting for GuC to become idle.
OTOH, since all pending requests have been successfully retired, that
error code has been already ignored by intel_gt_retire_requests_timeout(),
then we shouldn't fail.
Assume no more time has been left on error and pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.
v3: Don't fail on any error passed back via remaining_timeout.
v2: Fix the issue on the caller side, not the provider.
Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f235dbd5b768e238d365fd05d92de5a32abc1c1f)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
On XE_LPM+ platforms the media engines are carved out into a separate
GT but have a common GGTMMADR address range which essentially makes
the GGTT address space to be shared between media and render GT. As a
result any updates in GGTT shall invalidate TLB of GTs sharing it and
similarly any operation on GGTT requiring an action on a GT will have to
involve all GTs sharing it. setup_private_pat was being done on a per
GGTT based as that doesn't touch any GGTT structures moved it to per GT
based.
BSPEC: 63834
v2:
1. Add details to commit msg
2. includes fix for failure to add item to ggtt->gt_list, as suggested
by Lucas
3. as ggtt_flush() is used only for ggtt drop i915_is_ggtt check within
it.
4. setup_private_pat moved out of intel_gt_tiles_init
v3:
1. Move out for_each_gt from i915_driver.c (Jani Nikula)
v4: drop using RCU primitives on ggtt->gt_list as it is not an RCU list
(Matt Roper)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221122070126.4813-1-aravind.iddamsetty@intel.com
|
|
Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned. However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time. If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.
If request retirement succeeds but an error code is passed back via
remaininig_timeout, we may have no clue on how much of the initial timeout
might have been left for spending it on waiting for GuC to become idle.
OTOH, since all pending requests have been successfully retired, that
error code has been already ignored by intel_gt_retire_requests_timeout(),
then we shouldn't fail.
Assume no more time has been left on error and pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.
v3: Don't fail on any error passed back via remaining_timeout.
v2: Fix the issue on the caller side, not the provider.
Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-2-janusz.krzysztofik@linux.intel.com
|
|
The GT MCR code currently relies on uncore->lock to avoid race
conditions on the steering control register during MCR operations. The
*_fw() versions of MCR operations expect the caller to already hold
uncore->lock, while the non-fw variants manage the lock internally.
However the sole callsite of intel_gt_mcr_wait_for_reg_fw() does not
currently obtain the forcewake lock, allowing a potential race condition
(and triggering an assertion on lockdep builds). Furthermore, since
'wait for register value' requests may not return immediately, it is
undesirable to hold a fundamental lock like uncore->lock for the entire
wait and block all other MMIO for the duration; rather the lock is only
needed around the MCR read operations and can be released during the
delays.
Convert intel_gt_mcr_wait_for_reg_fw() to a non-fw variant that will
manage uncore->lock internally. This does have the side effect of
causing an unnecessary lookup in the forcewake table on each read
operation, but since the caller is still holding the relevant forcewake
domain, this will ultimately just incremenent the reference count and
won't actually cause any additional MMIO traffic.
In the future we plan to switch to a dedicated MCR lock to protect the
steering critical section rather than using the overloaded and
high-traffic uncore->lock; on MTL and beyond the new lock can be
implemented on top of the hardware-provided synchonization mechanism for
steering.
Fixes: 3068bec83eea ("drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221117173358.1980230-1-matthew.d.roper@intel.com
(cherry picked from commit 192bb40f030a41ca95c5cff8c9340b725bc7ba8b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
GVT Changes:
- gvt-next stuff mostly with refactor for the new MDEV interface.
i915 Changes:
- PSR fixes and improvements (Jouni)
- DP DSC fixes (Vinod, Jouni)
- More general display cleanups (Jani)
- More display collor management cleanup targetting degamma (Ville)
- remove circ_buf.h includes (Jiri)
- wait power off delay at driver remove to optimize probe (Jani)
- More audio cleanup targeting the ELD precompute readout (Ville)
- Enable DC power states on all eDP ports (Imre)
- RPL-P stepping info (Matt Atwood)
- MTL enabling patches (RK)
- Removal of DG2 force_probe (Matt)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y3f71obyEkImXoUF@intel.com
|
|
The GT MCR code currently relies on uncore->lock to avoid race
conditions on the steering control register during MCR operations. The
*_fw() versions of MCR operations expect the caller to already hold
uncore->lock, while the non-fw variants manage the lock internally.
However the sole callsite of intel_gt_mcr_wait_for_reg_fw() does not
currently obtain the forcewake lock, allowing a potential race condition
(and triggering an assertion on lockdep builds). Furthermore, since
'wait for register value' requests may not return immediately, it is
undesirable to hold a fundamental lock like uncore->lock for the entire
wait and block all other MMIO for the duration; rather the lock is only
needed around the MCR read operations and can be released during the
delays.
Convert intel_gt_mcr_wait_for_reg_fw() to a non-fw variant that will
manage uncore->lock internally. This does have the side effect of
causing an unnecessary lookup in the forcewake table on each read
operation, but since the caller is still holding the relevant forcewake
domain, this will ultimately just incremenent the reference count and
won't actually cause any additional MMIO traffic.
In the future we plan to switch to a dedicated MCR lock to protect the
steering critical section rather than using the overloaded and
high-traffic uncore->lock; on MTL and beyond the new lock can be
implemented on top of the hardware-provided synchonization mechanism for
steering.
Fixes: 3068bec83eea ("drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221117173358.1980230-1-matthew.d.roper@intel.com
|
|
Catch up on 6.1-rc cycle in order to solve the intel_backlight
conflict on linux-next.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
With MTL standalone media architecture the wopcm layout has changed,
with separate partitioning in WOPCM for the root GT GuC and the media
GT GuC. The size of WOPCM is 4MB with the lower 2MB reserved for the
media GT and the upper 2MB for the root GT.
Given that MTL has GuC deprivilege, the WOPCM registers are pre-locked
by the bios. Therefore, we can skip all the math for the partitioning
and just limit ourselves to sanity-checking the values.
v2: fix makefile file ordering (Jani)
v3: drop XELPM_SAMEDIA_WOPCM_SIZE, check huc instead of VDBOX (John)
v4: further clarify commit message, remove blank line (John)
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-5-daniele.ceraolospurio@intel.com
|
|
Turns out many of the files that need i915_reg.h get it implicitly via
{display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h
-> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h,
makes sense to drop it, but that requires adding quite a few new
includes all over the place.
Prefer including i915_reg.h where needed instead of adding another
implicit include, because eventually we'll want to split up i915_reg.h
and only include the specific registers at each place.
Also some places actually needed i915_irq.h too.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com
|
|
Convert some usages of legacy DRM logging macros into versions which tell
us on which device have the events occurred.
v2:
* Don't have struct drm_device as local. (Jani, Ville)
v3:
* Store gt, not i915, in workaround list. (John)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com
|
|
Runtime pm is not really per GT, therefore it make sense to
move lmem_userfault_list, lmem_userfault_lock and
userfault_wakeref from intel_gt to intel_runtime_pm structure,
which is embedded to i915.
No functional change.
v2:
- Fixes the code comment nit. [Matt Auld]
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027092242.1476080-2-anshuman.gupta@intel.com
|
|
We have the same code to determine the MMIO BAR in
two places. Collect it to a single place.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221005154159.18750-1-ville.syrjala@linux.intel.com
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Rather than treating multicast registers as 'i915_reg_t' let's define
them as a completely new type. This will allow the compiler to help us
make sure we're using multicast-aware functions to operate on multicast
registers.
This plan does break down a bit in places where we're just maintaining
heterogeneous lists of registers (e.g., various MMIO whitelists used by
perf, GVT, etc.) rather than performing reads/writes. We only really
care about the offset in those cases, so for now we can "cast" the
registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
want to look for better ways to store mixed collections of i915_reg_t
and i915_mcr_reg_t in the future.
v2:
- Add TLB invalidation registers
v3:
- Make type checking of i915_mmio_reg_offset() stricter. It will
accept either i915_reg_t or i915_mcr_reg_t, but will now raise a
compile error if any other type is passed, even if that type contains
a 'reg' field. (Jani)
- Drop a ton of GVT changes; allowing i915_mmio_reg_offset() to take
either an i915_reg_t or an i915_mcr_reg_t means that the huge lists
of MMIO_D*() macros used in GVT will continue to work without
modification. We need only make changes to structures that have an
explicit i915_reg_t in them now.
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-13-matthew.d.roper@intel.com
|
|
Rather than relying on the implicit behavior of intel_uncore_*()
functions, let's always use the intel_gt_mcr_*() functions to operate on
multicast/replicated registers.
v2:
- Add TLB invalidation registers
v3:
- Switch more uncore operations in mmio_invalidate_full() to MCR
operations for Xe_HP. (Bala)
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-10-matthew.d.roper@intel.com
|
|
On Xe_HP the fault registers are now in a multicast register range.
However as part of the GAM these registers follow special rules and we
need only read from the "primary" GAM's instance to get the information
we need. So a single intel_gt_mcr_read_any() (which will automatically
steer to the primary GAM) is sufficient; we don't need to loop over each
instance of the MCR register.
v2:
- Update more instances of fault registers. (Bala)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-7-matthew.d.roper@intel.com
|
|
Starting in Xe_HP, several registers our driver works with have been
converted from singleton registers into replicated registers with
multicast behavior. Although the registers are still located at the
same MMIO offsets as on previous platforms, let's duplicate the register
definitions in preparation for upcoming patches that will handle
multicast registers in a special manner.
The registers that are now replicated on Xe_HP are:
* PAT_INDEX (mslice replication)
* FF_MODE2 (gslice replication)
* COMMON_SLICE_CHICKEN3 (gslice replication)
* SLICE_COMMON_ECO_CHICKEN1 (gslice replication)
* SLICE_UNIT_LEVEL_CLKGATE (gslice replication)
* LNCFCMOCS (lncf replication)
Note that there are a couple places in selftest_mocs.c where the
gen9 version of LNCFCMOCS is still used without regards for which
platform we're on. Those cases are just doing an offset lookup and not
issuing any CPU reads/writes of the register, so the potentially
multicast nature of the register doesn't come into play.
v2:
- Add commit message note about the unconditional GEN9_LNCFCMOCS usage
in selftest_mocs. (Bala)
- Include some additional TLB registers.
Bspec: 66534
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-3-matthew.d.roper@intel.com
|
|
Daniele needs 84d4333c1e28 ("misc/mei: Add NULL check to component match
callback functions") in order to merge the DG2 HuC patches.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
Cross-subsystem Changes:
- MEI subsystem pieces for XeHP SDV GSC support
These are Acked-by Greg.
Driver Changes:
- Release mmaps on RPM suspend on discrete GPUs (Anshuman)
- Update GuC version to 7.5 on DG1, DG2 and ADL
- Revert "drm/i915/dg2: extend Wa_1409120013 to DG2" (Lucas)
- MTL enabling incl. standalone media (Matt R, Lucas)
- Explicitly clear BB_OFFSET for new contexts on Gen8+ (Chris)
- Fix throttling / perf limit reason decoding (Ashutosh)
- XeHP SDV GSC support (Vitaly, Alexander, Tomas)
- Fix issues with overrding firmware file paths (John)
- Invert if-else ladders to check latest version first (Lucas)
- Cancel GuC engine busyness worker synchronously (Umesh)
- Skip applying copy engine fuses outside PVC (Lucas)
- Eliminate Gen10 frequency read function (Lucas)
- Static code checker fixes (Gaosheng)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YyQ4Jgl3cpGL1/As@jlahtine-mobl.ger.corp.intel.com
|
|
Register GT0_PERF_LIMIT_REASONS (0x1381a8) is available only for
Gen11+. Therefore ensure perf_limit_reasons sysfs/debugfs files are created
only for Gen11+. Otherwise on Gen < 5 accessing these files results in the
following oops:
<1> [88.829420] BUG: unable to handle page fault for address: ffffc90000bb81a8
<1> [88.829438] #PF: supervisor read access in kernel mode
<1> [88.829447] #PF: error_code(0x0000) - not-present page
Bspec: 20008
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6863
Fixes: fe5979665f64 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
Fixes: fa68bff7cf27 ("drm/i915/gt: Add sysfs throttle frequency interfaces")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220919162401.2077713-1-ashutosh.dixit@intel.com
|
|
PERF_LIMIT_REASONS register for MTL media gt is different now.
v2: Avoid static inline for intel_gt_perf_limit_reasons_reg() (Jani)
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220910143844.1755324-3-ashutosh.dixit@intel.com
|
|
Release all mmap mapping for all lmem objects which are associated
with userfault such that, while pcie function in D3hot, any access
to memory mappings will raise a userfault.
Runtime resume the dgpu(when gem object lies in lmem).
This will transition the dgpu graphics function to D0
state if it was in D3 in order to access the mmap memory
mappings.
v2:
- Squashes the patches. [Matt Auld]
- Add adequate locking for lmem_userfault_list addition. [Matt Auld]
- Reused obj->userfault_count to avoid double addition. [Matt Auld]
- Added i915_gem_object_lock to check
i915_gem_object_is_lmem. [Matt Auld]
v3:
- Use i915_ttm_cpu_maps_iomem. [Matt Auld]
- Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
- Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
- Delete the mmaped obj from lmem_userfault_list in obj
destruction path. [Matt Auld]
- Get a wakeref for object destruction patch. [Matt Auld]
- Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
v4:
- Avoid using mmo offset to get the vma_node. [Matt Auld]
- Added comment to use the lmem_userfault_lock. [Matt Auld]
- Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
[Matt Auld]
- Fixed kernel test robot generated warning.
v5:
- Addressed the cosmetics comments. [Andi]
- Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
PCIe Specs 5.3.1.4.1
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-3-anshuman.gupta@intel.com
|
|
Refactor userfault_wakeref to re-use for discrete lmem mmap mapping
as well, as on discrete GTT mmap are not supported. Moving
userfault_wakeref from ggtt to gt structure.
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-2-anshuman.gupta@intel.com
|
|
When we hook up interrupts (in the next patch), interrupts for the media
GT are still processed as part of the primary GT's interrupt flow. As
such, we should share the same IRQ lock with the primary GT. Let's
convert gt->irq_lock into a pointer and just point the media GT's
instance at the same lock the primary GT is using.
v2:
- Point media's gt->irq_lock at the primary GT lock properly. (Daniele)
- Fix jump target for intel_root_gt_init_early errors. (Daniele)
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-14-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Xe_LPM+ platforms have "standalone media." I.e., the media unit is
designed as an additional GT with its own engine list, GuC, forcewake,
etc. Let's allow platforms to include media GTs in their device info.
v2:
- Simplify GSI register handling and split it out to a separate patch
for ease of review. (Daniele)
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Acked-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-13-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The common early GT init is needed for initialization of all GT types
(root/primary, remote tile, standalone media). Since standalone media
(coming in a future patch) will be implemented in a separate file,
rename and expose the function for use.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-7-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
We're going to introduce an additional intel_gt for MTL's media unit
soon. Let's provide a bit more multi-GT initialization framework in
preparation for that. The initialization will pull the list of GTs for
a platform from the device info structure. Although necessary for the
immediate MTL media enabling, this same framework will also be used
farther down the road when we enable remote tiles on xehpsdv and pvc.
v2:
- Re-add missing test for !HAS_EXTRA_GT_LIST in intel_gt_probe_all().
v3:
- Move intel_gt_definition struct to intel_gt_types.h. (Jani)
- Drop gtdef->setup(). For now we'll just use a switch() based on GT
type since we don't have too many different handlers for the
foreseeable future. (Jani)
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-6-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Unmapping of the MMIO range can be done as a DRM-managed action, which
will take care of the unmapping on device teardown and error paths.
This will also ensure proper ordering with respect to other DRM-managed
actions that we'll be using to clean up non-primary GTs in upcoming
patches.
We have not yet enabled any non-root GTs in the driver yet, so the
kfree() of the GT structure is effectively dead code. When we do start
enabling non-root GTs in upcoming patches, those are going to be using
DRM-managed allocations tied to the device lifetime, so we don't need to
explicitly free them (and kfree would be incorrect anyway).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-5-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
We're slowly transitioning the init-time kzalloc's of the driver over to
DRM-managed allocations; let's make sure the uncore objects allocated
for non-root GTs are thus allocated.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-4-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The original intent of intel_uncore_mmio_debug as described in commit
0a9b26306d6a ("drm/i915: split out uncore_mmio_debug") was to be a
singleton structure that could be shared between multiple GTs' uncore
objects in a multi-tile system. Somehow we went off track and
started allocating separate instances of this structure for each GT,
which defeats that original goal.
But in reality, there isn't even a need to share the mmio_debug between
multiple GTs; on all modern platforms (i.e., everything after gen7)
unclaimed register accesses are something that can only be detected for
display registers. There's no point in grabbing the debug spinlock and
checking for unclaimed accesses on an uncore used by an xehpsdv or pvc
remote tile GT, or the uncore used by a mtl standalone media GT since
all of the display accesses go through the primary intel_uncore.
The simplest solution is to simply leave uncore->debug NULL on all
intel_uncore instances except for the primary one. This will allow us
to avoid the pointless debug spinlock acquisition we've been doing on
MMIO accesses coming in through these intel_uncores.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-3-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Sync drm-intel-next with v6.0-rc as well as recent drm-intel-gt-next.
Since drm-next does not have commit f0c70d41e4e8 ("drm/i915/guc: remove
runtime info printing from time stamp logging") yet, only
drm-intel-gt-next, will need to do that as part of the merge here to
build.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
At the moment, when we refer to some PCI BAR we use the number of
this BAR in the code. The meaning of BARs between different platforms
may be different. Therefore, in order to organize the code,
let's start using defined names instead of numbers.
v2: Add lost header in cfg_space.c
Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220805155959.1983584-2-piotr.piorkowski@intel.com
|
|
Invalidate TLB in batches, in order to reduce performance regressions.
Currently, every caller performs a full barrier around a TLB
invalidation, ignoring all other invalidations that may have already
removed their PTEs from the cache. As this is a synchronous operation
and can be quite slow, we cause multiple threads to contend on the TLB
invalidate mutex blocking userspace.
We only need to invalidate the TLB once after replacing our PTE to
ensure that there is no possible continued access to the physical
address before releasing our pages. By tracking a seqno for each full
TLB invalidate we can quickly determine if one has been performed since
rewriting the PTE, and only if necessary trigger one for ourselves.
That helps to reduce the performance regression introduced by TLB
invalidate logic.
[mchehab: rebased to not require moving the code to a separate file]
Cc: stable@vger.kernel.org
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org
(cherry picked from commit 5d36acb7198b0e5eb88e6b701f9ad7b9448f8df9)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|