summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/uc
AgeCommit message (Collapse)Author
2023-06-26drm/i915/guc/slpc: Apply min softlimit correctlyVinay Belgaumkar
The scenario being fixed here is depicted in the following sequence- modprobe i915 echo 1 > /sys/class/drm/card0/gt/gt0/slpc_ignore_eff_freq echo 300 > /sys/class/drm/card0/gt_min_freq_mhz (RPn) cat /sys/class/drm/card0/gt_cur_freq_mhz --> cur == RPn as expected echo 1 > /sys/kernel/debug/dri/0/gt0/reset --> reset cat /sys/class/drm/card0/gt_min_freq_mhz --> cached freq is RPn cat /sys/class/drm/card0/gt_cur_freq_mhz --> it's not RPn, but RPe!! When SLPC reinitializes, it sets SLPC min freq to efficient frequency. Even if we disable efficient freq post that, we should restore the cached min freq (via H2G) for it to take effect. v2: Clarify commit message (Ashutosh) Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency") Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230621014257.1769564-1-vinay.belgaumkar@intel.com (cherry picked from commit da86b2b13f1d1ca26745b951ac94421f3137539a) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-06-20drm/i915/huc: Fix missing error code in intel_huc_init()Harshit Mogalapalli
Smatch warns: drivers/gpu/drm/i915/gt/uc/intel_huc.c:388 intel_huc_init() warn: missing error code 'err' When the allocation of VMAs fail: The value of err is zero at this point and it is passed to PTR_ERR and also finally returning zero which is success instead of failure. Fix this by adding the missing error code when VMA allocation fails. Fixes: 08872cb13a71 ("drm/i915/mtl/huc: auth HuC via GSC") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230614223646.2583633-1-daniele.ceraolospurio@intel.com (cherry picked from commit ce432fd34cc6c7b7af06d1403ec0be19d1e518dc) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-06-20drm/i915/gsc: take a wakeref for the proxy-init-completion checkAlan Previn
Ensure intel_gsc_uc_fw_init_done and intel_gsc_uc_fw_proxy_init takes a wakeref before reading GSC Shim registers. NOTE: another patch in review also adds a call from selftest to this same function. (https://patchwork.freedesktop.org/series/117713/) which is why i am adding the wakeref inside the callee, not the caller. v2: - add a helper, 'gsc_uc_get_fw_status' for both callers (Daniele Ceraolo) Fixes: 99afb7cc8c44 ("drm/i915/pxp: Add ARB session creation and cleanup") Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230608230716.3079594-1-alan.previn.teres.alexis@intel.com (cherry picked from commit 8c33c3755b75c98d8eb490df345b4187a295a1a8) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-06-10Merge drm/drm-next into drm-intel-nextJani Nikula
Sync up with changes from drm-intel-gt-next. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-06-09Merge tag 'drm-intel-gt-next-2023-06-08' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake. Driver Changes: Fixes/improvements/new stuff: - Use large rings for compute contexts (Chris Wilson) - Better logging/debug of unexpected GuC communication issues (Michal Wajdeczko) - Clear out entire reports after reading if not power of 2 size (Ashutosh Dixit) - Limit lmem allocation size to succeed on SmallBars (Andrzej Hajda) - perf/OA capture robustness improvements on DG2 (Umesh Nerlige Ramappa) - Fix error code in intel_gsc_uc_heci_cmd_submit_nonpriv() (Dan Carpenter) Future platform enablement: - Add workaround 14016712196 (Tejas Upadhyay) - HuC loading for MTL (Daniele Ceraolo Spurio) - Allow user to set cache at BO creation (Fei Yang) Miscellaneous: - Use system include style for drm headers (Jani Nikula) - Drop legacy CTB definitions (Michal Wajdeczko) - Turn off the timer to sample frequencies when GT is parked (Ashutosh Dixit) - Make PMU sample array two-dimensional (Ashutosh Dixit) - Use the correct error value when kernel_context() fails (Andi Shyti) - Fix second parameter type of pre-gen8 pte_encode callbacks (Nathan Chancellor) - Fix parameter in gmch_ggtt_insert_{entries, page}() (Nathan Chancellor) - Fix size_t format specifier in gsccs_send_message() (Nathan Chancellor) - Use the fdinfo helper (Tvrtko Ursulin) - Add some missing error propagation (Tvrtko Ursulin) - Reduce I915_MAX_GT to 2 (Matt Atwood) - Rename I915_PMU_MAX_GTS to I915_PMU_MAX_GT (Matt Atwood) - Remove some obsolete definitions (John Harrison) Merges: - Merge drm/drm-next into drm-intel-gt-next (Tvrtko Ursulin) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZIH09fqe5v5yArsu@tursulin-desk
2023-06-08drm/i915/gsc: Fix error code in intel_gsc_uc_heci_cmd_submit_nonpriv()Dan Carpenter
This should return negative -EAGAIN instead of positive EAGAIN. Fixes: e5e1e6d28ebc ("drm/i915/pxp: Add MTL helpers to submit Heci-Cmd-Packet to GSC") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZH7sr+Vs4zOQoouU@moroto
2023-06-07drm/i915/gt/uc: drop unused but set variable sseuJani Nikula
Prepare for re-enabling -Wunused-but-set-variable. Apparently sseu is leftover from commit 9a92732f040a ("drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr"). Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/d542f25bffd5a50ff621bee93415a972c7768a2a.1685119007.git.jani.nikula@intel.com
2023-06-06drm/i915/guc: Remove some obsolete definitionsJohn Harrison
There were a bunch of defines and structures left over from an API update a very long time ago. Remove them. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531155942.441862-1-John.C.Harrison@Intel.com
2023-06-05drm/i915/huc: define HuC FW version for MTLDaniele Ceraolo Spurio
Follow the same logic as DG2, so just a meu binary with no version number. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-8-daniele.ceraolospurio@intel.com
2023-06-05drm/i915/mtl/huc: auth HuC via GSCDaniele Ceraolo Spurio
The full authentication via the GSC requires an heci packet submission to the GSC FW via the GSC CS. The GSC has new PXP command for this (literally called NEW_HUC_AUTH). The intel_huc_auth function is also updated to handle both authentication types. v2: check that the GuC auth for clear media has completed before proceding with the full auth v3: use a define for the object size (Alan) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-6-daniele.ceraolospurio@intel.com
2023-06-05drm/i915/huc: differentiate the 2 steps of the MTL HuC auth flowDaniele Ceraolo Spurio
Before we add the second step of the MTL HuC auth (via GSC), we need to have the ability to differentiate between them. To do so, the huc authentication check is duplicated for GuC and GSC auth, with GSC-enabled binaries being considered fully authenticated only after the GSC auth step. To report the difference between the 2 auth steps, a new case is added to the HuC getparam. This way, the clear media driver can start submitting before full auth, as partial auth is enough for those workloads. v2: fix authentication status check for DG2 v3: add a better comment at the top of the HuC file to explain the different approaches to load and auth (John) v4: update call to intel_huc_is_authenticated in the pxp code to check for GSC authentication v5: drop references to meu and esclamation mark in huc_auth print (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> #v2 Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-5-daniele.ceraolospurio@intel.com
2023-06-05drm/i915/huc: Load GSC-enabled HuC via DMA xfer if the fuse says soDaniele Ceraolo Spurio
In the previous patch we extracted the offset of the legacy-style HuC binary located within the GSC-enabled blob, so now we can use that to load the HuC via DMA if the fuse is set that way. Note that we now need to differentiate between "GSC-enabled binary" and "loaded by GSC", so the former case has been renamed to "has GSC headers" for clarity, while the latter is now based on the fuse instead of the binary format. This way, all the legacy load paths are automatically taken (including the auth by GuC) without having to implement further code changes. v2: s/is_meu_binary/has_gsc_headers/, clearer logs (John) v3: split check for GSC access, better comments (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-4-daniele.ceraolospurio@intel.com
2023-06-05drm/i915/huc: Parse the GSC-enabled HuC binaryDaniele Ceraolo Spurio
The new binaries that support the 2-step authentication contain the legacy-style binary, which we can use for loading the HuC via DMA. To find out where this is located in the image, we need to parse the manifest of the GSC-enabled HuC binary. The manifest consist of a partition header followed by entries, one of which contains the offset we're looking for. Note that the DG2 GSC binary contains entries with the same names, but it doesn't contain a full legacy binary, so we need to skip assigning the dma offset in that case (which we can do by checking the ccs). Also, since we're now parsing the entries, we can extract the HuC version that way instead of using hardcoded offsets. Note that the GSC binary uses the same structures in its binary header, so they've been added in their own header file. v2: fix structure names to match meu defines (s/CPT/CPD/), update commit message, check ccs validity, drop old version location defines. v3: drop references to the MEU tool to reduce confusion, fix log (John) v4: fix log for real (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> #v2 Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-3-daniele.ceraolospurio@intel.com
2023-06-05drm/i915/uc: perma-pin firmwaresDaniele Ceraolo Spurio
Now that each FW has its own reserved area, we can keep them always pinned and skip the pin/unpin dance on reset. This will make things easier for the 2-step HuC authentication, which requires the FW to be pinned in GGTT after the xfer is completed. Since the vma is now valid for a long time and not just for the quick pin-load-unpin dance, the name "dummy" is no longer appropriare and has been replaced with vma_res. All the functions have also been updated to operate on vma_res for consistency. Given that we pin the vma behind the allocator's back (which is ok because we do the pinning in an area that was previously reserved for thus purpose), we do need to explicitly re-pin on resume because the automated helper won't cover us. v2: better comments and commit message, s/dummy/vma_res/ Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-2-daniele.ceraolospurio@intel.com
2023-06-05Merge drm/drm-next into drm-intel-gt-nextTvrtko Ursulin
For conflict avoidance we need the following commit: c9a9f18d3ad8 drm/i915/huc: use const struct bus_type pointers Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-05-31drm/i915/guc: Drop legacy CTB definitionsMichal Wajdeczko
We've already switched to new HXG definitions some time ago, drop legacy CTB definitions to avoid mistakes. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509201103.538-1-michal.wajdeczko@intel.com
2023-05-31Merge drm/drm-next into drm-intel-nextJani Nikula
Sync the drm-intel-gt-next changes back to drm-intel-next via drm-next. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-05-30drm/i915/guc: Track all sent actions to GuCMichal Wajdeczko
For easier debug of any unexpected error responses from GuC that might be related to non-blocking fast requests, track action code (and stack if under DEBUG_GUC config) for every H2G request. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230526235538.2230780-4-John.C.Harrison@Intel.com
2023-05-30drm/i915/guc: Update log for unsolicited CTB responseMichal Wajdeczko
Instead of printing message fence twice, include HXG header of the unexpected message and its len. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230526235538.2230780-3-John.C.Harrison@Intel.com
2023-05-30drm/i915/guc: Use FAST_REQUEST for non-blocking H2G callsMichal Wajdeczko
In addition to the already defined REQUEST HXG message format, which is used when sender expects some confirmation or data, HXG protocol includes definition of the FAST REQUEST message, that may be used when sender does not expect any useful data to be returned. Using this instead of GUC_HXG_TYPE_EVENT for non-blocking CTB requests will allow GuC to send back GUC_HXG_TYPE_RESPONSE_FAILURE in case of errors. Note that it is not possible to return such errors to the caller, since this is for non-blocking calls and the related fence is not stored. Instead such messages are treated as unexpected, which will give an indication of potential GuC misprogramming that warrants extra debugging effort. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230526235538.2230780-2-John.C.Harrison@Intel.com
2023-05-29Merge tag 'drm-intel-gt-next-2023-05-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - New getparam for querying PXP support and load status Cross-subsystem Changes: - GSC/MEI proxy driver Driver Changes: Fixes/improvements/new stuff: - Avoid clearing pre-allocated framebuffers with the TTM backend (Nirmoy Das) - Implement framebuffer mmap support (Nirmoy Das) - Disable sampler indirect state in bindless heap (Lionel Landwerlin) - Avoid out-of-bounds access when loading HuC (Lucas De Marchi) - Actually return an error if GuC version range check fails (John Harrison) - Get mutex and rpm ref just once in hwm_power_max_write (Ashutosh Dixit) - Disable PL1 power limit when loading GuC firmware (Ashutosh Dixit) - Block in hwmon while waiting for GuC reset to complete (Ashutosh Dixit) - Provide sysfs for SLPC efficient freq (Vinay Belgaumkar) - Add support for total context runtime for GuC back-end (Umesh Nerlige Ramappa) - Enable fdinfo for GuC backends (Umesh Nerlige Ramappa) - Don't capture Gen8 regs on Xe devices (John Harrison) - Fix error capture for virtual engines (John Harrison) - Track patch level versions on reduced version firmware files (John Harrison) - Decode another GuC load failure case (John Harrison) - GuC loading and firmware table handling fixes (John Harrison) - Fix confused register capture list creation (John Harrison) - Dump error capture to kernel log (John Harrison) - Dump error capture to dmesg on CTB error (John Harrison) - Disable rps_boost debugfs when SLPC is used (Vinay Belgaumkar) Future platform enablement: - Disable stolen memory backed FB for A0 [mtl] (Nirmoy Das) - Various refactors for multi-tile enablement (Andi Shyti, Tejas Upadhyay) - Extend Wa_22011802037 to MTL A-step (Madhumitha Tolakanahalli Pradeep) - WA to clear RDOP clock gating [mtl] (Haridhar Kalvala) - Set has_llc=0 [mtl] (Fei Yang) - Define MOCS and PAT tables for MTL (Madhumitha Tolakanahalli Pradeep) - Add PTE encode function [mtl] (Fei Yang) - fix mocs selftest [mtl] (Fei Yang) - Workaround coherency issue for Media [mtl] (Fei Yang) - Add workaround 14018778641 [mtl] (Tejas Upadhyay) - Implement Wa_14019141245 [mtl] (Radhakrishna Sripada) - Fix the wa number for Wa_22016670082 [mtl] (Radhakrishna Sripada) - Use correct huge page manager for MTL (Jonathan Cavitt) - GSC/MEI support for Meteorlake (Alexander Usyskin, Daniele Ceraolo Spurio) - Define GuC firmware version for MTL (John Harrison) - Drop FLAT CCS check [mtl] (Pallavi Mishra) - Add MTL for remapping CCS FBs [mtl] (Clint Taylor) - Meteorlake PXP enablement (Alan Previn) - Do not enable render power-gating on MTL (Andrzej Hajda) - Add MTL performance tuning changes (Radhakrishna Sripada) - Extend Wa_16014892111 to MTL A-step (Radhakrishna Sripada) - PMU multi-tile support (Tvrtko Ursulin) - End support for set caching ioctl [mtl] (Fei Yang) Driver refactors: - Use i915 instead of dev_priv insied the file_priv structure (Andi Shyti) - Use proper parameter naming in for_each_engine() (Andi Shyti) - Use gt_err for GT info (Tejas Upadhyay) - Consolidate duplicated capture list code (John Harrison) - Capture list naming clean up (John Harrison) - Use kernel-doc -Werror when CONFIG_DRM_I915_WERROR=y (Jani Nikula) - Preparation for using PAT index (Fei Yang) - Use pat_index instead of cache_level (Fei Yang) Miscellaneous: - Fix memory leaks in i915 selftests (Cong Liu) - Record GT error for gt failure (Tejas Upadhyay) - Migrate platform-dependent mock hugepage selftests to live (Jonathan Cavitt) - Update the SLPC selftest (Vinay Belgaumkar) - Throw out set() wrapper (Jani Nikula) - Large driver kernel doc cleanup (Jani Nikula) - Fix probe injection CI failures after recent change (John Harrison) - Make unexpected firmware versions an error in debug builds (John Harrison) - Silence UBSAN uninitialized bool variable warning (Ashutosh Dixit) - Fix memory leaks in function live_nop_switch (Cong Liu) Merges: - Merge drm/drm-next into drm-intel-gt-next (Joonas Lahtinen) Signed-off-by: Dave Airlie <airlied@redhat.com> # Conflicts: # drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZG5SxCWRSkZhTDtY@tursulin-desk
2023-05-26drm/i915/gsc: use system include style for drm headersJani Nikula
Use <> instead of "" for including headers from include/. Fixes: 8a9bf29546a1 ("drm/i915/gsc: add initial support for GSC proxy") Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230525094942.941123-1-jani.nikula@intel.com
2023-05-17drm/i915/guc/slpc: Disable rps_boost debugfsVinay Belgaumkar
rps_boost debugfs shows host turbo related info. This is not valid when SLPC is enabled. guc_slpc_info already shows the number of boosts. Add num_waiters there as well and disable rps_boost when SLPC is enabled. v2: Replace Bug with Link to resolve checkpatch warning Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7632 Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230516154905.1048006-1-vinay.belgaumkar@intel.com
2023-05-17Merge drm/drm-next into drm-intel-nextRodrigo Vivi
Backmerge to get some hwmon dependencies. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-05-16drm/i915/guc: Dump error capture to dmesg on CTB errorJohn Harrison
In the past, There have been sporadic CTB failures which proved hard to reproduce manually. The most effective solution was to dump the GuC log at the point of failure and let the CI system do the repro. It is preferable not to dump the GuC log via dmesg for all issues as it is not always necessary and is not helpful for end users. But rather than trying to re-invent the code to do this each time it is wanted, commit the code but for DEBUG_GUC builds only. v2: Use IS_ENABLED for testing config options. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230418181744.3251240-3-John.C.Harrison@Intel.com
2023-05-15drm/i915/hwmon: Silence UBSAN uninitialized bool variable warningAshutosh Dixit
Loading i915 on UBSAN enabled kernels (CONFIG_UBSAN/CONFIG_UBSAN_BOOL) causes the following warning: UBSAN: invalid-load in drivers/gpu/drm/i915/gt/uc/intel_uc.c:558:2 load of value 255 is not a valid value for type '_Bool' Call Trace: dump_stack_lvl+0x57/0x7d ubsan_epilogue+0x5/0x40 __ubsan_handle_load_invalid_value.cold+0x43/0x48 __uc_init_hw+0x76a/0x903 [i915] ... i915_driver_probe+0xfb1/0x1eb0 [i915] i915_pci_probe+0xbe/0x2d0 [i915] The warning happens because during probe i915_hwmon is still not available which results in the output boolean variable *old remaining uninitialized. Silence the warning by initializing the variable to an arbitrary value. v2: Move variable initialization to the declaration (Andi) Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230512203735.2635237-1-ashutosh.dixit@intel.com
2023-05-12drm/i915/guc: Fix confused register capture list creationJohn Harrison
The GuC has a completely separate engine class enum when referring to register capture lists, which combines render and compute. The driver was using the 'normal' GuC specific engine class enum instead. That meant that it thought it was defining a capture list for compute engines, the list was actually being applied to the GSC engine. And if a platform didn't have a render engine, then it would get no compute register captures at all. Fix that. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230512013544.3367606-1-John.C.Harrison@Intel.com
2023-05-12drm/i1915/guc: Fix probe injection CI failures after recent changeJohn Harrison
A recent change bumped a 'notice' message up to 'error' level for debug builds to help trap incorrect configurations in CI systems. Unfortunately, the error condition in question is triggered by the error injection probe test. So change the message again to be 'probe error' level instead. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Fixes: 760133d42f0a ("drm/i915/uc: Make unexpected firmware versions an error in debug builds") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230510205556.312999-1-John.C.Harrison@Intel.com
2023-05-11drm/i915/pxp: Add GSC-CS backend to send GSC fw messagesAlan Previn
Add GSC engine based method for sending PXP firmware packets to the GSC firmware for MTL (and future) products. Use the newly added helpers to populate the GSC-CS memory header and send the message packet to the FW by dispatching the GSC_HECI_CMD_PKT instruction on the GSC engine. We use non-priveleged batches for submission to GSC engine which require two buffers for the request: - a buffer for the HECI packet that contains PXP FW commands - a batch-buffer that contains the engine instruction for sending the HECI packet to the GSC firmware. Thus, add the allocation and freeing of these buffers in gsccs init and fini. The GSC-fw may reply to commands with a SUCCESS but with an additional pending-bit set in the reply packet. This bit means the GSC-FW is currently busy and the caller needs to try again with the gsc_message_handle the fw returned. Thus, add a wrapper to continuously retry send_message while replaying the gsc_message_handle. Retries need to follow the arch-spec count and delay until GSC-FW replies with the real SUCCESS or timeout after that spec'd delay. The GSC-fw requires a non-zero host_session_handle provided by the caller to enable gsc_message_handle tracking. Thus, allocate the host_session_handle at init and destroy it at fini (the latter requiring an FYI to the gsc-firmware). Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230511231738.1077674-5-alan.previn.teres.alexis@intel.com
2023-05-11drm/i915/pxp: Add MTL helpers to submit Heci-Cmd-Packet to GSCAlan Previn
Add helper functions into a new file for heci-packet-submission. The helpers will handle generating the MTL GSC-CS Memory-Header and submission of the Heci-Cmd-Packet instructions to the engine. NOTE1: These common functions for heci-packet-submission will be used by different i915 callers: 1- GSC-SW-Proxy: This is pending upstream publication awaiting a few remaining opens 2- MTL-HDCP: An equivalent patch has also been published at: https://patchwork.freedesktop.org/series/111876/. (Patch 1) 3- PXP: This series. NOTE2: A difference in this patch vs what is appearing is in bullet 2 above is that HDCP (and SW-Proxy) will be using priveleged submission (GGTT and common gsc-uc-context) while PXP will be using non-priveleged PPGTT, context and batch buffer. Therefore this patch will only slightly overlap with the MTL-HDCP patches despite have very similar function names (emit_foo vs emit_nonpriv_foo). This is because HECI_CMD_PKT instructions require different flows and hw-specific code when done via PPGTT based submission (not different from other engines). MTL-HDCP contains the same intel_gsc_mtl_header_t structures as this but the helpers there are different. Both add the same new file names. NOTE3: Additional clarity about the heci-cmd-pkt layout and where the common helpers come in: - On MTL, when an i915 subsystem needs to send a command request to the security firmware, it will send that via the GSC- engine-command-streamer. - However those commands, (lets call them "gsc_specific_fw_api" calls), are not understood by the GSC command streamer hw. - The GSC CS only looks at the GSC_HECI_CMD_PKT instruction and passes it along to the GSC firmware. - The GSC FW on the other hand needs additional metadata to know which usage service is being called (PXP, HDCP, proxy, etc) along with session specific info. Thus an extra header called GSC-CS HECI Memory Header, (C) in below diagram is prepended before the FW specific API, (D). - Thus, the structural layout of the request submitted would need to look like the diagram below (for non-priv PXP). - In the diagram, the common helper for HDCP, (GSC-Sw-Proxy) and PXP (i.e. new function intel_gsc_uc_heci_cmd_emit_mtl_header) will populate blob (C) while additional helpers, different for PPGGTT (this patch) vs GGTT (HDCP series) will populate blobs (A) and (B) below. ___________________________________________________________ (A) | MI_BATCH_BUFFER_START (ppgtt, batchbuff-addr, ...) | | | | | _|________________________________________________ | | (B)| GSC_HECI_CMD_PKT (pkt-addr-in, pkt-size-in, | | | | pkt-addr-out, pkt-size-out) |-------- | | MI_BATCH_BUFFER_END | | | | |________________________________________________| | | | | | |_________________________________________________________| | | --------------------------------------------------------- | \|/ ______V___________________________________________ | _________________________________________ | |(C)| | | | | struct intel_gsc_mtl_header { | | | | validity marker | | | | heci_clent_id | | | | ... | | | | } | | | |_______________________________________| | |(D)| | | | | struct gsc_fw_specific_api_foobar { | | | | ... | | | | For an example, see | | | | 'struct pxp43_create_arb_in' at | | | | intel_pxp_cmd_interface_43.h | | | | | | | | } | | | | Struture depends on command type | | | | struct gsc_fw_specific_api_foobar { | | | |_______________________________________| | |________________________________________________| That said, this patch provides basic helpers but leaves the PXP subsystem (i.e. the caller) to handle (D) and everything else such as input/output size verification or handling the responses from security firmware (for example, requiring a retry). Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230511231738.1077674-4-alan.previn.teres.alexis@intel.com
2023-05-11drm/i915: use pat_index instead of cache_levelFei Yang
Currently the KMD is using enum i915_cache_level to set caching policy for buffer objects. This is flaky because the PAT index which really controls the caching behavior in PTE has far more levels than what's defined in the enum. In addition, the PAT index is platform dependent, having to translate between i915_cache_level and PAT index is not reliable, and makes the code more complicated. From UMD's perspective there is also a necessity to set caching policy for performance fine tuning. It's much easier for the UMD to directly use PAT index because the behavior of each PAT index is clearly defined in Bspec. Having the abstracted i915_cache_level sitting in between would only cause more ambiguity. PAT is expected to work much like MOCS already works today, and by design userspace is expected to select the index that exactly matches the desired behavior described in the hardware specification. For these reasons this patch replaces i915_cache_level with PAT index. Also note, the cache_level is not completely removed yet, because the KMD still has the need of creating buffer objects with simple cache settings such as cached, uncached, or writethrough. For kernel objects, cache_level is used for simplicity and backward compatibility. For Pre-gen12 platforms PAT can have 1:1 mapping to i915_cache_level, so these two are interchangeable. see the use of LEGACY_CACHELEVEL. One consequence of this change is that gen8_pte_encode is no longer working for gen12 platforms due to the fact that gen12 platforms has different PAT definitions. In the meantime the mtl_pte_encode introduced specfically for MTL becomes generic for all gen12 platforms. This patch renames the MTL PTE encode function into gen12_pte_encode and apply it to all gen12. Even though this change looks unrelated, but separating them would temporarily break gen12 PTE encoding, thus squash them in one patch. Special note: this patch changes the way caching behavior is controlled in the sense that some objects are left to be managed by userspace. For such objects we need to be careful not to change the userspace settings.There are kerneldoc and comments added around obj->cache_coherent, cache_dirty, and how to bypass the checkings by i915_gem_object_has_cache_level. For full understanding, these changes need to be looked at together with the two follow-up patches, one disables the {set|get}_caching ioctl's and the other adds set_pat extension to the GEM_CREATE uAPI. Bspec: 63019 Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com
2023-05-11drm/i915/guc: Don't capture Gen8 regs on Xe devicesJohn Harrison
A pair of pre-Xe registers were being included in the Xe capture list. GuC was rejecting those as being invalid and logging errors about them. So, stop doing it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Fixes: dce2bd542337 ("drm/i915/guc: Add Gen9 registers for GuC error state capture.") Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230428185636.457407-2-John.C.Harrison@Intel.com (cherry picked from commit b049132d61336f643d8faf2f6574b063667088cf) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2023-05-05drm/i915/uc: Make unexpected firmware versions an error in debug buildsJohn Harrison
If the DEBUG_GEM config option is set then escalate the 'unexpected firmware version' message from a notice to an error. This will ensure that the CI system treats such occurences as a failure and logs a bug about it (or fails the pre-merge testing). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-7-John.C.Harrison@Intel.com
2023-05-05drm/i915/uc: Reject duplicate entries in firmware tableJohn Harrison
It was noticed that duplicate entries in the firmware table could cause an infinite loop in the firmware loading code if that entry failed to load. Duplicate entries are a bug anyway and so should never happen. Ensure they don't by tweaking the table validation code to reject duplicates. For full m/m/p files, that can be done by simply tweaking the patch level check to reject matching values. For reduced version entries, the filename itself must be compared. v2: Improve comment (review by Daniele) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-6-John.C.Harrison@Intel.com
2023-05-05drm/i915/uc: Enhancements to firmware table validationJohn Harrison
The validation of the firmware table was being done inside the code for scanning the table for the next available firmware blob. Which is unnecessary. So pull it out into a separate function that is only called once per blob type at init time. Also, drop the CONFIG_SELFTEST requirement and make errors terminal. It was mentioned that potential issues with backports would not be caught by regular pre-merge CI as that only occurs on tip not stable branches. Making the validation unconditional and failing driver load on detecting of a problem ensures that such backports will also be validated correctly. This requires adding a firmware global flag to indicate an issue with any of the per firmware tables. This is done rather than adding a new state enum as a new enum value would be a much more invasive change - lots of places would need updating to support the new error state. Note also that this change means that a table error will cause the driver to wedge even on platforms that don't require firmware files. This is intentional as per the above backport concern - someone doing backports is not guaranteed to test on every platform that they may potential affect. So forcing a failure on all platforms ensures that the problem will be noticed and corrected immediately. v2: Change to unconditionally fail module load on a validation error (review feedback/discussion with Daniele). v3: Add a new flag to track table validation errors (review feedback/discussion with Daniele). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-5-John.C.Harrison@Intel.com
2023-05-05drm/i915/guc: Print status register when waiting for GuC to loadJohn Harrison
If the GuC load is taking an excessively long time, the wait loop currently prints the GT frequency. Extend that to include the GuC status as well so we can see if the GuC is actually making progress or not. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-3-John.C.Harrison@Intel.com
2023-05-05drm/i915/guc: Decode another GuC load failure caseJohn Harrison
Explain another potential firmware failure mode and early exit the long wait if hit. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-2-John.C.Harrison@Intel.com
2023-05-05Merge tag 'drm-next-2023-05-05' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull more drm fixes from Dave Airlie: "This is the fixes for the last couple of weeks for i915 and last 3 weeks for amdgpu, lots of them but pretty scattered around and all pretty small. amdgpu: - SR-IOV fixes - DCN 3.2 fixes - DC mclk handling fixes - eDP fixes - SubVP fixes - HDCP regression fix - DSC fixes - DC FP fixes - DCN 3.x fixes - Display flickering fix when switching between vram and gtt - Z8 power saving fix - Fix hang when skipping modeset - GPU reset fixes - Doorbell fix when resizing BARs - Fix spurious warnings in gmc - Locking fix for AMDGPU_SCHED IOCTL - SR-IOV fix - DCN 3.1.4 fix - DCN 3.2 fix - Fix job cleanup when CS is aborted i915: - skl pipe source size check - mtl transcoder mask fix - DSI power on sequence fix - GuC versioning corner case fix" * tag 'drm-next-2023-05-05' of git://anongit.freedesktop.org/drm/drm: (48 commits) drm/amdgpu: drop redundant sched job cleanup when cs is aborted drm/amd/display: filter out invalid bits in pipe_fuses drm/amd/display: Change default Z8 watermark values drm/amdgpu: disable SDMA WPTR_POLL_ENABLE for SR-IOV drm/amdgpu: add a missing lock for AMDGPU_SCHED drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini() drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v10_0_hw_fini drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini drm/amdgpu: Enable doorbell selfring after resize FB BAR drm/amdgpu: Use the default reset when loading or reloading the driver drm/amdgpu: Fix mode2 reset for sienna cichlid drm/i915/dsi: Use unconditional msleep() instead of intel_dsi_msleep() drm/i915/mtl: Add the missing CPU transcoder mask in intel_device_info drm/i915/guc: Actually return an error if GuC version range check fails drm/amd/display: Lowering min Z8 residency time drm/amd/display: fix flickering caused by S/G mode drm/amd/display: Set min_width and min_height capability for DCN30 drm/amd/display: Isolate remaining FPU code in DCN32 drm/amd/display: Update bounding box values for DCN321 drm/amd/display: Do not clear GPINT register when releasing DMUB from reset ...
2023-05-05drm/i915/mtl: Define GuC firmware version for MTLJohn Harrison
First release of GuC for Meteorlake. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230504202252.1104212-3-John.C.Harrison@Intel.com
2023-05-05drm/i915/uc: Track patch level versions on reduced version firmware filesJohn Harrison
When reduced version firmware files were added (matching major component being the only strict requirement), the minor version was still tracked and a notification reported if it was older. However, the patch version should really be tracked as well for the same reasons. The KMD can work without the change but if the effort has been taken to release a new firmware with the change then there must be a valid reason for doing so - important bug fix, security fix, etc. And in that case it would be good to alert the user if they are missing out on that new fix. v2: Use correct patch version number and drop redunant debug print (review by Daniele / CI results). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230504202252.1104212-2-John.C.Harrison@Intel.com
2023-05-04drm/i915/gsc: add support for GSC proxy interruptDaniele Ceraolo Spurio
The GSC notifies us of a proxy request via the HECI2 interrupt. The interrupt must be enabled both in the HECI layer and in our usual gt irq programming; for the latter, the interrupt is enabled via the same enable register as the GSC CS, but it does have its own mask register. When the interrupt is received, we also need to de-assert it in both layers. The handling of the proxy request is deferred to the same worker that we use for GSC load. New flags have been added to distinguish between the init case and the proxy interrupt. v2: Make sure not to set the reset bit when enabling/disabling the GSC interrupts, fix defines (Alan) v3: rebase on proxy status register check Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502163854.317653-5-daniele.ceraolospurio@intel.com
2023-05-04drm/i915/gsc: add initial support for GSC proxyDaniele Ceraolo Spurio
The GSC uC needs to communicate with the CSME to perform certain operations. Since the GSC can't perform this communication directly on platforms where it is integrated in GT, i915 needs to transfer the messages from GSC to CSME and back. The proxy flow is as follow: 1 - i915 submits a request to GSC asking for the message to CSME 2 - GSC replies with the proxy header + payload for CSME 3 - i915 sends the reply from GSC as-is to CSME via the mei proxy component 4 - CSME replies with the proxy header + payload for GSC 5 - i915 submits a request to GSC with the reply from CSME 6 - GSC replies either with a new header + payload (same as step 2, so we restart from there) or with an end message. After GSC load, i915 is expected to start the first proxy message chain, while all subsequent ones will be triggered by the GSC via interrupt. To communicate with the CSME, we use a dedicated mei component, which means that we need to wait for it to bind before we can initialize the proxies. This usually happens quite fast, but given that there is a chance that we'll have to wait a few seconds the GSC work has been moved to a dedicated WQ to not stall other processes. v2: fix code style, includes and variable naming (Alan) v3: add extra check for proxy status, fix includes and comments Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230502163854.317653-4-daniele.ceraolospurio@intel.com
2023-05-04drm/i915/guc: add intel_guc_state_capture member docs for ads_null_cache and ↵Jani Nikula
max_mmio_per_node drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:216: warning: Function parameter or member 'ads_null_cache' not described in 'intel_guc_state_capture' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:216: warning: Function parameter or member 'max_mmio_per_node' not described in 'intel_guc_state_capture' Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c83878163221ed3684a6de5d5e1c5373ddd5c06f.1683041799.git.jani.nikula@intel.com
2023-05-04drm/i915/guc: drop lots of kernel-doc markersJani Nikula
The documentation is closer to not being kernel-doc, so just drop the kernel-doc markers. drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'size' not described in '__guc_capture_bufstate' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'data' not described in '__guc_capture_bufstate' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'rd' not described in '__guc_capture_bufstate' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:27: warning: Function parameter or member 'wr' not described in '__guc_capture_bufstate' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'link' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'is_partial' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'eng_class' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'eng_inst' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'guc_id' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'lrca' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:59: warning: Function parameter or member 'reginfo' not described in '__guc_capture_parsed_output' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:62: warning: wrong kernel-doc identifier on line: * struct guc_debug_capture_list_header / struct guc_debug_capture_list drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:80: warning: wrong kernel-doc identifier on line: * struct __guc_mmio_reg_descr / struct __guc_mmio_reg_descr_group drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:105: warning: wrong kernel-doc identifier on line: * struct guc_state_capture_header_t / struct guc_state_capture_t / drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'is_valid' not described in '__guc_capture_ads_cache' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'ptr' not described in '__guc_capture_ads_cache' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'size' not described in '__guc_capture_ads_cache' drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h:163: warning: Function parameter or member 'status' not described in '__guc_capture_ads_cache' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'marker' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'read_ptr' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'write_ptr' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'size' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'sampled_write_ptr' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'wrap_offset' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'flush_to_file' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'buffer_full_cnt' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'reserved' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'flags' not described in 'guc_log_buffer_state' drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h:491: warning: Function parameter or member 'version' not described in 'guc_log_buffer_state' Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/9c210d53fdbd6da5fac42e435855d269504919d7.1683041799.git.jani.nikula@intel.com
2023-05-04drm/i915/guc: add dbgfs_node member kernel-docJani Nikula
drivers/gpu/drm/i915/gt/uc/intel_guc.h:274: warning: Function parameter or member 'dbgfs_node' not described in 'intel_guc' Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b0f681dd82289dd86da78c6242411e8d812e51a1.1683041799.git.jani.nikula@intel.com
2023-05-03drm/i915/guc: Fix error capture for virtual enginesJohn Harrison
GuC based register dumps in error capture logs were basically broken for virtual engines. This can be seen in igt@gem_exec_balancer@hang: [IGT] gem_exec_balancer: starting subtest hang [drm] GPU HANG: ecode 12:4:e1524110, in gem_exec_balanc [6388] [drm] GT0: GUC: No register capture node found for 0x1005 / 0xFEDC311D [drm] GPU HANG: ecode 12:4:00000000, in gem_exec_balanc [6388] [IGT] gem_exec_balancer: exiting, ret=0 The test causes a hang on both engines of a virtual engine context. The engine instance zero hang gets a valid error capture but the non-instance-zero hang does not. Fix that by scanning through the list of pending register captures when a hang notification for a virtual engine is received. That way, the hang can be assigned to the correct physical engine prior to starting the error capture process. So later on, when the error capture handler tries to find the engine register list, it looks for one on the correct engine. Also, sneak in a missing blank line before a comment in the node search code. v2: Fix null pointer deref on non-GuC platforms. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230428185636.457407-5-John.C.Harrison@Intel.com
2023-05-03drm/i915/guc: Capture list naming clean upJohn Harrison
Don't use 'xe_lp*' prefixes for register lists that are common with Gen8. Don't add Xe only GSC registers to pre-Xe devices that don't even have a GSC engine. Fix Xe_LP name. Don't use GEN9 as a prefix for register lists that contain all GEN8 registers. Rename the 'default_' register list prefix to 'gen8_' as that is the more accurate name. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230428185636.457407-4-John.C.Harrison@Intel.com
2023-05-03drm/i915/guc: Consolidate duplicated capture list codeJohn Harrison
Remove 99% duplicated steered register list code. Also, include the pre-Xe steered registers in the pre-Xe list generation. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230428185636.457407-3-John.C.Harrison@Intel.com
2023-05-03drm/i915/guc: Don't capture Gen8 regs on Xe devicesJohn Harrison
A pair of pre-Xe registers were being included in the Xe capture list. GuC was rejecting those as being invalid and logging errors about them. So, stop doing it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Fixes: dce2bd542337 ("drm/i915/guc: Add Gen9 registers for GuC error state capture.") Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230428185636.457407-2-John.C.Harrison@Intel.com
2023-05-03drm/i915/guc: Actually return an error if GuC version range check failsJohn Harrison
Dan Carpenter pointed out that 'err' was not being set in the case where the GuC firmware version range check fails. Fix that. Note that while this is a bug fix for a previous patch (see Fixes tag below). It is an exceedingly low risk bug. The range check is asserting that the GuC firmware version is within spec. So it should not be possible to ever have a firmware file that fails this check. If larger version numbers are required in the future, that would be a backwards breaking spec change and thus require a major version bump, in which case an old i915 driver would not load that new version anyway. Fixes: 9bbba0667f37 ("drm/i915/guc: Use GuC submission API version number") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230421224742.2357198-1-John.C.Harrison@Intel.com (cherry picked from commit 80ab31799002166ac7c660bacfbff4f85bc29107) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>