Age | Commit message (Collapse) | Author |
|
The Dynamic Inhibit Context Switch is an optimization aimed at reducing
the amount of time the HW is stuck waiting on an unsatisfied semaphore.
When this optimization is enabled, the GuC will dynamically modify the
CTX_CTRL_INHIBIT_SYN_CTX_SWITCH in the CTX_CONTEXT_CONTROL register of
LRCs to enable immediate switching out on an unsatisfied semaphore wait
when multiple contexts are competing for time on the same engine.
This feature is available on recent HW from GuC 70.40.1 onwards and it
is enabled via a per-VF feature opt-in.
v2: rebase
v3: switch to using guc_buf_cache instead of dedicated alloc
v4: add helper to check for feature availability (Michal), don't enable
if multi-lrc is possible.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250625205405.1653212-4-daniele.ceraolospurio@intel.com
|
|
On newer HW (Xe2 onwards + PVC) it is possible to get extra information
when a CAT error occurs, specifically a dword reporting the error type.
To enable this extra reporting, we need to opt-in with the GuC, which is
done via a specific per-VF feature opt-in H2G.
On platforms where the HW does not support the extra reporting, the GuC
will set the type to 0xdeadbeef, so we can keep the code simple and
opt-in to the feature on every platform and then just discard the data
if it is invalid.
Note that on native/PF we're guaranteed that the opt in is available
because we don't support any GuC old enough to not have it, but if we're
a VF we might be running on a non-XE PF with an older GuC, so we need to
handle that case. We can re-use the invalid type above to handle this
scenario the same way as if the feature was not supported in HW.
Given that this patch is the first user of the guc_buf_cache on native
and VF, it also extends that feature to non-PF use-cases.
v2: simpler print for the error type (John), rebase
v3: use guc_buf_cache instead of new alloc, simpler doc (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> #v1
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250625205405.1653212-3-daniele.ceraolospurio@intel.com
|
|
This reverts commit 3972872e459d812ab5e481a231a6066cf4f4d0f4.
There are several things wrong with the way this WA was implemented:
- The KLV is only supported on GuC 70.47.0 or newer, so we shouldn't
apply it unconditionally.
- The KLV requires 2 DWs of data, which are not currently provided.
The GuC currently ignores any unknown KLVs, so on versions older that
70.47.0 nothing happens. However, starting on 70.47.0 the GuC attempts
to parse the KLV and fails due to the missing data, causing a GuC load
abort.
Given that 70.47.0 is the first GuC version approved for public release
for PTL, let's revert this patch so it doesn't cause the GuC load to
fail with that blob. We can then re-apply it properly fixed after the
GuC definition is merged, which will also have the added benefit of
running the KLV addition through CI with the right GuC version.
Fixes: 3972872e459d ("drm/xe/ptl: Apply Wa_16026007364")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: sanirban <sk.anirban@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250625001202.1616606-2-daniele.ceraolospurio@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
As part of this WA GuC will save and restore value of two XE3_Media
control registers that were not included in the HW power context.
v2:
- Update klv name (Badal)
Signed-off-by: sanirban <sk.anirban@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://lore.kernel.org/r/20250619133413.107423-2-sk.anirban@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
These error codes are not actually used in the driver but it is
extremely useful to have them available to understand error messages.
v2: Add a bunch more error codes and drop 'status' from names (review
feedback by Michal W).
v3: Drop 'SUCCESS' response as meaningless in current API (review
feedback by Michal W).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512215324.1457009-3-John.C.Harrison@Intel.com
|
|
Some GuC messages are constructed with incrementing dword counter
rather than referencing specific DWORDs, as described in GuC interface
specification.
This change introduces the definitions of DWORD numbers for parameters
which will need to be referenced in a CTB parser to be added in a
following patch. To ensure correctness of these DWORDs, verification
in form of asserts was added to the message construction code.
v2: Renamed enum members, added ones for single context registration,
modified asserts to check values rather than indexes.
v3: Reordered assert args to take less lines
v4: Added lengths
v5: Renamed MULTI_LRC_MSG_LEN to MULTI_LRC_MSG_MIN_LEN
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512114018.361843-4-tomasz.lis@intel.com
|
|
The workaround is only relevant to SRIOV but does affect all platforms.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://lore.kernel.org/r/20250403185619.1555853-2-John.C.Harrison@Intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Add support for function level engine activity stats.
Engine activity stats are enabled when VF's are enabled
v2: remove unnecessary initialization
move offset to improve code readability (Umesh)
remove global for function engine activity (Lucas)
v3: fix commit message (Michal)
v4: remove enable function parameter
fix kernel-doc (Umesh)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250311071759.2117211-2-riana.tauro@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Allow user to provide a low latency hint. When set, KMD sends a hint
to GuC which results in special handling for that process. SLPC will
ramp the GT frequency aggressively every time it switches to this
process.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to processes that set this bit during process
creation.
Improvement with this approach as below:
Before,
:~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Graphics [0xe20b]
Driver version : 24.52.0 (Linux x64)
Compute units : 160
Clock frequency : 2850 MHz
Kernel launch latency : 283.16 us
After,
:~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Graphics [0xe20b]
Driver version : 24.52.0 (Linux x64)
Compute units : 160
Clock frequency : 2850 MHz
Kernel launch latency : 63.38 us
Compute PR: https://github.com/intel/compute-runtime/pull/794
Mesa PR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33214
IGT PR: https://patchwork.freedesktop.org/patch/639989/
V10(Lucas):
- Remove doc from drm-uapi.rst
v9(Vinay):
- remove extra line, align commit message
v8(Vinay):
- Add separate example for using low latency hint
v7(Jose):
- Update UMD PR
- applicable to all gpus
V6:
- init flags, remove redundant flags check (MAuld)
V5:
- Move uapi doc to documentation and GuC ABI specific change (Rodrigo)
- Modify logic to restrict exec queue flags (MAuld)
V4:
- To make it clear, dont use exec queue word (Vinay)
- Correct typo in description of flag (Jose/Vinay)
- rename set_strategy api and replace ctx with exec queue(Vinay)
- Start with 0th bit to indentify user flags (Jose)
V3:
- Conver user flag to kernel internal flag and use (Oak)
- Support query config for use to check kernel support (Jose)
- Dont need to take runtime pm (Vinay)
V2:
- DRM_XE_EXEC_QUEUE_LOW_LATENCY_HINT 1 planned for other hint(Szymon)
- Add motivation to description (Lucas)
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250228070224.739295-2-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
|
|
GuC provides support to read engine counters to calculate the
engine activity. KMD exposes two counters via the PMU interface to
calculate engine activity
Engine Active Ticks(engine-active-ticks) - active ticks of engine
Engine Total Ticks (engine-total-ticks) - total ticks of engine
Engine activity percentage can be calculated as below
Engine activity % = (engine active ticks/engine total ticks) * 100.
v2: fix cosmetic review comments
add forcewake for gpm_ts (Umesh)
v3: fix CI hooks error
change function parameters and unpin bo on error
of allocate_activity_buffers
fix kernel-doc (Umesh)
use engine activity (Umesh, Lucas)
rename xe_engine_activity to xe_guc_engine_*
fix commit message to use engine activity (Lucas, Umesh)
v4: add forcewake in PMU layer
v5: fix makefile
use drmm_kcalloc instead of kmalloc_array
remove managed bo
skip init for VF
fix cosmetic review comments (Michal)
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250224053903.2253539-2-riana.tauro@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
A session is initialized (i.e. started) by sending a message to the GSC.
The initialization will be triggered when a user opts-in to using PXP;
the interface for that is coming in a follow-up patch in the series.
v2: clean up error messages, use new ARB define (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-7-daniele.ceraolospurio@intel.com
|
|
After a session is terminated, we need to inform the GSC so that it can
clean up its side of the allocation. This is done by sending an
invalidation command with the session ID.
The invalidation will be triggered in response to a termination,
interrupt, whose handling is coming in the next patch in the series.
v2: Better comment and error messages (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-5-daniele.ceraolospurio@intel.com
|
|
PXP requires submissions to the HW for the following operations
1) Key invalidation, done via the VCS engine
2) Communication with the GSC FW for session management, done via the
GSCCS.
Key invalidation submissions are serialized (only 1 termination can be
serviced at a given time) and done via GGTT, so we can allocate a simple
BO and a kernel queue for it.
Submissions for session management are tied to a PXP client (identified
by a unique host_session_id); from the GSC POV this is a user-accessible
construct, so all related submission must be done via PPGTT. The driver
does not currently support PPGTT submission from within the kernel, so
to add this support, the following changes have been included:
- a new type of kernel-owned VM (marked as GSC), required to ensure we
don't use fault mode on the engine and to mark the different lock
usage with lockdep.
- a new function to map a BO into a VM from within the kernel.
v2: improve comments and function name, remove unneeded include (John)
v3: fix variable/function names in documentation
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-3-daniele.ceraolospurio@intel.com
|
|
Fix all typos in files of xe, reported by codespell tool.
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|
|
Some features require inter-GuC communication channels on multi-tile
devices. So allocate and enable such.
v2: Correct use of xe_bo_get/put (review feedback from Matthew Brost)
Add extra assert, re-order a calculation for better clarity and add
comments to slot calculation (review feedback from Daniele). Also
slightly re-work the slot calc to avoid code duplication.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241120000222.204095-3-John.C.Harrison@Intel.com
|
|
This KLV allows to set the scheduling priority for each VF, also
for the PF.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-2-michal.wajdeczko@intel.com
|
|
After restore, GuC will not answer to any messages from VF KMD until
fixups are applied. When that is done, VF KMD sends RESFIX_DONE
message to GuC, at which point GuC resumes normal operation.
This patch implements sending the RESFIX_DONE message at end of
post-migration recovery.
v2: keep pm ref during whole recovery, style fixes (Michal)
v3: assert removal to separate patch, debug message per GuC instead
of one, comments changes (Michal)
v4: improve one debug message (Michal)
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-4-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
|
|
As part of this WA, GuC will hold a forcewake for certain
MMIO accesses outside the GT/media domains.
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015234428.2004825-1-vinay.belgaumkar@intel.com
|
|
Upon the G2H Notify-Err-Capture event, parse through the
GuC Log Buffer (error-capture-subregion) and generate one or
more capture-nodes. A single node represents a single "engine-
instance-capture-dump" and contains at least 3 register lists:
global, engine-class and engine-instance. An internal link
list is maintained to store one or more nodes.
Because the link-list node generation happen before the call
to devcoredump, duplicate global and engine-class register
lists for each engine-instance register dump if we find
dependent-engine resets in a engine-capture-group.
To avoid dynamically allocate the output nodes during gt reset,
pre-allocate a fixed number of empty nodes up front (at the
time of ADS registration) that we can consume from or return to
an internal cached list of nodes.
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-5-zhanjun.dong@intel.com
|
|
Capture-nodes generated by GuC are placed in the GuC capture ring
buffer which is a sub-region of the larger Guc-Log-buffer.
Add capture output size check before allocating the shared buffer.
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-4-zhanjun.dong@intel.com
|
|
Add referenced registers defines and list of registers.
Update GuC ADS size allocation to include space for
the lists of error state capture register descriptors.
Then, populate GuC ADS with the lists of registers we want
GuC to report back to host on engine reset events. This list
should include global, engine-class and engine-instance
registers for every engine-class type on the current hardware.
Ensure we allocate a persistent storage for the register lists
that are populated into ADS so that we don't need to allocate
memory during GT resets when GuC is reloaded and ADS population
happens again.
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-2-zhanjun.dong@intel.com
|
|
Add a worker function helper for asynchronously dumping state when an
internal/fatal error is detected in CT processing. Being asynchronous
is required to avoid deadlocks and scheduling-while-atomic or
process-stalled-for-too-long issues. Also check for a bunch more error
conditions and improve the handling of some existing checks.
v2: Use compile time CONFIG check for new (but not directly CT_DEAD
related) checks and use unsigned int for a bitmask, rename
CT_DEAD_RESET to CT_DEAD_REARM and add some explaining comments,
rename 'hxg' macro parameter to 'ctb' - review feedback from Michal W.
Drop CT_DEAD_ALIVE as no need for a bitfield define to just set the
entire mask to zero.
v3: Fix kerneldoc
v4: Nullify some floating pointers after free.
v5: Add section headings and device info to make the state dump look
more like a devcoredump to allow parsing by the same tools (eventual
aim is to just call the devcoredump code itself, but that currently
requires an xe_sched_job, which is not available in the CT code).
v6: Fix potential for leaking snapshots with concurrent error
conditions (review feedback from Julia F).
v7: Don't complain about unexpected G2H messages yet because there is
a known issue causing them. Fix bit shift bug with v6 change. Add GT
id to fake coredump headers and use puts instead of printf.
v8: Disable the head mis-match check in g2h_read because it is failing
on various discrete platforms due to unknown reasons.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-9-John.C.Harrison@Intel.com
|
|
In upcoming patches we will add support to the PF driver to save
and restore a VF state maintained by the GuC to allow VF migration.
Add necessary H2G definitions to our GuC firmware ABI header.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240912203817.1880-3-michal.wajdeczko@intel.com
|
|
Enable workarounds for HW bug where render engine reset fails. Given
that we're bumping the minimum required GuC version to 70.29, we're
guaranteed to always have support for this KLV in the GuC.
v2: Enable KLV correctly for either workaround (Lucas)
v4: Add check for minimum supported GuC firmware version. Enable w/a for
hw version 20.01 too. (Daniele)
v5 (Daniele): remove now unneeded fw type and version checks (JohnH)
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805205435.921921-1-daniele.ceraolospurio@intel.com
|
|
There are many more error codes used that the GuC firmware can
return in the RESPONSE_FAILURE message. Add to the ABI header
those which are more likely to be seen by the PF or VF drivers.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625141258.1257-3-michal.wajdeczko@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
We have kernel-doc for all HXG message types but Fast Request.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240610120411.1768-3-michal.wajdeczko@intel.com
|
|
Those were copy-pasted from i915 code and never used in Xe driver.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240610120411.1768-2-michal.wajdeczko@intel.com
|
|
We already have a dedicated file for GuC SLPC ABI definitions.
Move definition of the SETUP_PC_GUCRC action and related enum
to that file, rename them to match format of other new ABI
definitions and add simple kernel-doc.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240609181931.1724-2-michal.wajdeczko@intel.com
|
|
VF drivers can't access GMD_ID register over MMIO.
The value of the GMD_ID register must be queried from GuC.
It is available as GLOBAL_CFG_GMD_ID KLV.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523192240.844-3-michal.wajdeczko@intel.com
|
|
GuC loading can take longer than it is supposed to for various
reasons. So add in the code to cope with that and to report it when it
happens. There are also many different reasons why GuC loading can
fail, so add in the code for checking for those and for reporting
issues in a meaningful manner rather than just hitting a timeout and
saying 'fail: status = %x'.
Also, remove the 'FIXME' comment about an i915 bug that has never been
applicable to Xe!
v2: Actually report the requested and granted frequencies rather than
showing granted twice (review feedback from Badal).
v3: Locally code all the timeout and end condition handling because a
helper function is not allowed (review feedback from Lucas/Rodrigo).
v4: Add more documentation comments and rename a define to add units
(review feedback from Lucas).
v5: Fix copy/paste error in xe_mmio_wait32_not (review feedback from
Lucas) and rebase (no more return value from guc_wait_ucode).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240518043700.3264362-3-John.C.Harrison@Intel.com
|
|
In upcoming patches we will add support to the VF driver to
read its configuration from the GuC using special H2G actions.
Add necessary definitions to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-4-michal.wajdeczko@intel.com
|
|
The version negotiation between the VF driver and the GuC firmware
must start with explicit soft reset of the GuC state initiated by
the VF driver. Add VF2GUC action definitions to the ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-3-michal.wajdeczko@intel.com
|
|
In upcoming patches we will add a version negotiation between
the VF driver and the GuC firmware. Add necessary definitions
to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-2-michal.wajdeczko@intel.com
|
|
When thresholds used to monitor VFs activities are configured,
then GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Add necessary definitions
to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-7-michal.wajdeczko@intel.com
|
|
Apart from the obvious spelling typo, use the correct values for
infinity quantum/timeout settings (it's 0x0 instead of 0xFFFFFFFF).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240424140506.2133-1-michal.wajdeczko@intel.com
|
|
GuC firmware specification says that maximum value for the execution
quantum KLV is 100s and anything exceeding that will be clamped.
The same limitation applies to the preemption timeout KLV.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419123543.270-2-michal.wajdeczko@intel.com
|
|
This initial GuC Relay ABI specification includes messages for ABI
version negotiation and to query values of runtime/fuse registers.
We will start handling those messages on the PF driver soon.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423180436.2089-2-michal.wajdeczko@intel.com
|
|
Enable WA for a bug that could cause the C6 state machine to hang
during RC6 exit.
v2: Add comment clarifying the WA (John H)
v3: Add more details to the comment (John H)
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417054802.1766359-1-vinay.belgaumkar@intel.com
|
|
There are a couple of new workarounds for LNL that are implemented in
the GuC firmware. The KMD needs to enable them explicitly.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240410002646.3002394-2-John.C.Harrison@Intel.com
|
|
In upcoming patches the PF driver will add support to change VFs
configuration and will need to use PF2GUC_UPDATE_VF_CFG messages.
Add necessary definitions to our GuC firmware ABI header.
Definitions of the GuC VF Configuration KLVs used by this action
are already present in abi/guc_klvs_abi.h
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415173937.1287-5-michal.wajdeczko@intel.com
|
|
In upcoming patches the PF driver will add support to change GuC
policies and will need to use PF2GUC_UPDATE_VGT_POLICY messages.
Add necessary definitions to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240410170338.1199-4-michal.wajdeczko@intel.com
|
|
Enable GuC Wa_14019882105 to block interrupts during C6 flow
when the memory path has been blocked
v2: Make helper function generic and name it as
guc_waklv_enable_simple (John Harrison)
v3: Make warning descriptive (John Harrison)
v4: s/drm_WARN/xe_gt_WARN/ (Michal)
Cc: John Harrison <john.harrison@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240405084231.3620848-3-badal.nilawar@intel.com
|
|
In upcoming patches the PF driver will add support to handle the
GUC2PF_VF_STATE_NOTIFY events and to send PF2GUC_VF_CONTROL request
messages. Add necessary definitions to our GuC firmware ABI header.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240326191518.363-2-michal.wajdeczko@intel.com
|
|
Use include guard macro name that follows naming used by the other
GuC ABI files.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213214908.1481-1-michal.wajdeczko@intel.com
|
|
All GuC ABI definitions are unsigned and not defining as unsigned is
causing build errors [1].
[1] https://lore.kernel.org/all/20240123111235.3097079-1-geert@linux-m68k.org/
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240131025424.2087936-1-matthew.brost@intel.com
|
|
The GSC uC needs to communicate with the CSME to perform certain
operations. Since the GSC can't perform this communication directly on
platforms where it is integrated in GT, the graphics driver needs to
transfer the messages from GSC to CSME and back. The proxy flow must be
manually started after the GSC is loaded to signal to GSC that we're
ready to handle its messages and allow it to query its init data from
CSME.
Note that the component must be removed before the pci_remove call
completes, so we can't use a drmm helper for it and we need to instead
perform the cleanup as part of the removal flow.
v2: add function documentation, more targeted memory clear, clearer logs
and variable names (Alan)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117182621.2653049-2-daniele.ceraolospurio@intel.com
|
|
The communication between Virtual Function (VF) drivers and
Physical Function (PF) drivers is based on the GuC firmware
acting as a proxy (relay) agent.
Add related ABI definitions that we will be using in upcoming
patches with our GuC Relay implementation.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-7-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
|
|
In upcoming new GuC ABI definitions we will want to refer to max
number of dwords that could fit into CTB HXG message. Add explicit
definition named as GUC_CTB_MAX_DWORDS and start using it.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-6-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
|
|
We're currently sending non-blocking H2G messages using the EVENT type,
which suppresses all CTB protocol replies from the GuC, including the
failure cases. This might cause errors to slip through and manifest as
unexpected behavior (e.g. a context state might not be what the driver
thinks it is because the state change command was silently rejected by
the GuC). To avoid this kind of problems, we can use the FAST_REQUEST
type instead, which suppresses the reply only on success; this way we
still get the advantage of not having to wait for an ack from the GuC
(i.e. the H2G is still non-blocking) while still detecting errors.
Since we can't escalate to the caller when a non-blocking message
fails, we need to escalate to GT reset instead.
Note that FAST_REQUEST failures are NOT expected and are usually a sign
that the H2G was either malformed or requested an illegal operation.
v2: assign fence values to FAST_REQUEST messages, fix abi doc, use xe_gt
printers (Michal).
v3: fix doc alignment, fix and improve prints (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v2
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
|
|
HuC authentication via GSC is performed by submitting the appropriate
PXP packet to the GSC FW. This packet can trigger a "pending" reply from
the FW, so we need to handle that and resubmit. Note that the auth via
GSC can only be performed if the HuC has already been authenticated by
the GuC.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Vivaik Balasubrawmanian <vivaik.balasubrawmanian@intel.com>
Reviewed-by: Vivaik Balasubrawmanian <vivaik.balasubrawmanian@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|