summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/i915_perf.c
AgeCommit message (Collapse)Author
2023-07-11drm/i915/perf: Consider OA buffer boundary when zeroing out reportsUmesh Nerlige Ramappa
For reports that are not powers of 2, reports at the end of the OA buffer may get split across the buffer boundary. When zeroing out such reports, take the split into consideration. v2: Use OA_BUFFER_SIZE (Ashutosh) Fixes: 09a36015d9a0 ("drm/i915/perf: Clear out entire reports after reading if not power of 2 size") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230616173402.699776-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 40b1588a750240cbe8a83117aa785d778749a77c) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-06-07i915/perf: Do not add ggtt offset to hw_tailUmesh Nerlige Ramappa
ggtt offset for hw_tail is not required for the calculations, so drop it. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230605193923.1836048-3-umesh.nerlige.ramappa@intel.com
2023-06-07i915/perf: Drop the aging_tail logic in perf OAUmesh Nerlige Ramappa
On DG2, capturing OA reports while running heavy render workloads sometimes results in invalid OA reports where 64-byte chunks inside reports have stale values. Under memory pressure, high OA sampling rates (13.3 us) and heavy render workload, occasionally, the OA HW TAIL pointer does not progress as fast as the sampling rate. When these glitches occur, the TAIL pointer takes approx. 200us to progress. While this is expected behavior from the HW perspective, invalid reports are not expected. In oa_buffer_check_unlocked(), when we execute the if condition, we are updating the oa_buffer.tail to the aging tail and then setting pollin based on this tail value, however, we do not have a chance to rewind and validate the reports prior to setting pollin. The validation happens in a subsequent call to oa_buffer_check_unlocked(). If a read occurs before this validation, then we end up reading reports up until this oa_buffer.tail value which includes invalid reports. Though found on DG2, this affects all platforms. The aging tail logic is no longer necessary since we are explicitly checking for landed reports. Start by dropping the aging tail logic. v2: - Drop extra blank line - Add reason to drop aging logic (Ashutosh) - Add bug links (Ashutosh) - rename aged_tail to read_tail - Squash patches 3 and 1 v3: (Ashutosh) - Remove extra spaces - Remove gtt_offset from the pollin calculation - s/Bug:/Link/ in commit message (checkpatch) Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7484 Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7757 Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230605193923.1836048-2-umesh.nerlige.ramappa@intel.com
2023-05-31drm/i915/perf: Clear out entire reports after reading if not power of 2 sizeAshutosh Dixit
Clearing out report id and timestamp as means to detect unlanded reports only works if report size is power of 2. That is, only when report size is a sub-multiple of the OA buffer size can we be certain that reports will land at the same place each time in the OA buffer (after rewind). If report size is not a power of 2, we need to zero out the entire report to be able to detect unlanded reports reliably. v2: Add Fixes tag (Umesh) Fixes: 1cc064dce4ed ("drm/i915/perf: Add support for OA media units") Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230523204042.4180641-1-ashutosh.dixit@intel.com
2023-05-04drm/i915/perf: fix i915_perf_ioctl_version() kernel-docJani Nikula
drivers/gpu/drm/i915/i915_perf.c:5307: warning: Function parameter or member 'i915' not described in 'i915_perf_ioctl_version' Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b93ddb95a15d1376936349b32c7facb35c76be82.1683041799.git.jani.nikula@intel.com
2023-03-29drm/i915: fix race condition UAF in i915_perf_add_config_ioctlMin Li
Userspace can guess the id value and try to race oa_config object creation with config remove, resulting in a use-after-free if we dereference the object after unlocking the metrics_lock. For that reason, unlocking the metrics_lock must be done after we are done dereferencing the object. Signed-off-by: Min Li <lm0963hack@gmail.com> Fixes: f89823c21224 ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface") Cc: <stable@vger.kernel.org> # v4.14+ Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230328093627.5067-1-lm0963hack@gmail.com [tursulin: Manually added stable tag.]
2023-03-24drm/i915/perf: Wa_14017512683: Disable OAM if media C6 is enabled in BIOSUmesh Nerlige Ramappa
OAM does not work with media C6 enabled on some steppings of MTL. Disable OAM if we detect that media C6 was enabled in bios. v2: (Ashutosh) - Remove drm_notice from the driver load path - Log a drm_err when opening an OAM stream on affected steppings v3: - Initialize the engine group even if mc6 is enabled (Ashutosh) - Checkpatch fix Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-12-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Pass i915 object to perf revision helperUmesh Nerlige Ramappa
In some cases, perf revision may rely on specific steppings of a platform. To determine the platform, pass i915 object to the perf revision helper. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-11-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Add support for OA media unitsUmesh Nerlige Ramappa
MTL introduces additional OA units dedicated to media use cases. Add support for programming these OA units by passing the media engine class and instance parameters. UMD specific changes for GPUvis support: https://patchwork.freedesktop.org/patch/522827/?series=114023 https://patchwork.freedesktop.org/patch/522822/?series=114023 https://patchwork.freedesktop.org/patch/522826/?series=114023 https://patchwork.freedesktop.org/patch/522828/?series=114023 https://patchwork.freedesktop.org/patch/522816/?series=114023 https://patchwork.freedesktop.org/patch/522825/?series=114023 v2: (Ashutosh) - check for IP_VER(12, 70) instead of MTL - remove PERF_GROUP_OAG comment in mtl_oa_base - remove oa_buffer.group - use engine->oa_group->type in engine_supports_oa_format - remove fw_domains and use FORCEWAKE_ALL - remove MPES/MPEC comment - s/xehp/mtl/ in b counter validation function name - remove engine_supports_oa in __oa_engine_group - remove warn_ON from __oam_engine_group - refactor oa_init_groups and oa_init_regs - assign g->type correctly - use enum oa_type definition v3: (Ashutosh) - Drop oa_unit_functional as engine_supports_oa is enough v4: - s/DRM_DEBUG/drm_dbg/ Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-10-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Add engine class instance parameters to perfUmesh Nerlige Ramappa
One or more engines map to a specific OA unit. All reports from these engines are captured in the OA buffer managed by this OA unit. Current i915 OA implementation supports only the OAG unit. OAG primarily caters to render engine, so i915 OA uses render as the default engine in the OA implementation. Since there are more OA units on newer hardware that map to other engines, allow user to pass engine class and instance to select and program specific OA units. UMD specific changes for GPUvis support: https://patchwork.freedesktop.org/patch/522827/?series=114023 https://patchwork.freedesktop.org/patch/522822/?series=114023 https://patchwork.freedesktop.org/patch/522826/?series=114023 https://patchwork.freedesktop.org/patch/522828/?series=114023 https://patchwork.freedesktop.org/patch/522816/?series=114023 https://patchwork.freedesktop.org/patch/522825/?series=114023 v2: (Ashutosh) - Clarify commit message - Add drm_dbg - Clarify uapi description v3: (Ashutosh) - Remove irrelevant info from the uapi comment v4: Ensure engine class:instance is passed together (Ashutosh) v5: Remove unnecessary quote (Ashutosh) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-9-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Handle non-power-of-2 reportsUmesh Nerlige Ramappa
Some of the newer OA formats are not powers of 2. For those formats, adjust the hw_tail accordingly when checking for new reports. v2: (Ashutosh) - Switch to OA_TAKEN for diff calculation - Use OA_BUFFER_SIZE instead of the vma size - Update comments Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-8-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Parse 64bit report header formats correctlyUmesh Nerlige Ramappa
Now that OA formats come in flavor of 64 bit reports, the report header has 64 bit report-id, timestamp, context-id and gpu-ticks fields. When filtering these reports, use the right width for these fields. Note that upper dword of context id is reserved, so squash lower dword only. v2: (Ashutosh) - Drop inline - Update comment with dword definitions - report id and timestamp Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-7-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Fail modprobe if i915_perf_init fails on OOMUmesh Nerlige Ramappa
i915_perf_init can fail due to OOM. Fail driver init if i915_perf_init fails. v2: (Jani) - Reorder patch in the series Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-6-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Group engines into respective OA groupsUmesh Nerlige Ramappa
Now that we may have multiple OA units in a single GT as well as on separate GTs, create an engine group that maps to a single OA unit. v2: (Jani) - Drop warning on ENOMEM - Reorder patch in the series v3: (Ashutosh) - Remove unused members from perf structs - Update comments - Update engine_supports_oa check - Just return 1 in num_perf_groups_per_gt for now - Set engine->oa_group to NULL to begin with v4: Use engine_supports_oa() check in oa_init_reg_state (Ashutosh) v5: Rebase after dropping engine_supports_oa helper Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-5-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Validate OA sseu config outside switchUmesh Nerlige Ramappa
Once OA supports media engine class:instance, the engine can only be validated outside the switch since class and instance parameters are separate entities. Since OA sseu config depends on engine class:instance, validate OA sseu config outside the switch. v2: (Ashutosh) - Clarify commit message - Use drm_dbg instead of DRM_DEBUG - Reorder stack variables Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-4-umesh.nerlige.ramappa@intel.com
2023-03-24drm/i915/perf: Drop wakeref on GuC RC errorChris Wilson
If we fail to adjust the GuC run-control on opening the perf stream, make sure we unwind the wakeref just taken. v2: Retain old goto label names (Ashutosh) v3: Drop bitfield boolean Fixes: 01e742746785 ("drm/i915/guc: Support OA when Wa_16011777198 is enabled") Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-2-umesh.nerlige.ramappa@intel.com
2022-12-30Merge drm/drm-next into drm-intel-gt-nextRodrigo Vivi
Sync after v6.2-rc1 landed in drm-next. We need to get some dependencies in place before we can merge the fixes series from Gwan-gyeong and Chris. References: https://lore.kernel.org/all/Y6x5JCDnh2rvh4lA@intel.com/ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-17drm/i915/mtl: Add OA support by enabling 32 bit OAG formats for MTLUmesh Nerlige Ramappa
Without an entry in oa_init_supported_formats, OA will not be functional in MTL. Enable OA support by enabling 32 bit OAG formats for MTL. Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20228 Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212220902.1819159-5-umesh.nerlige.ramappa@intel.com
2022-12-17drm/i915/mtl: Update OA mux whitelist for MTLUmesh Nerlige Ramappa
0x20cc (WAIT_FOR_RC6_EXIT on other platforms) is repurposed on MTL. Use a separate mux table to verify oa configs passed by user. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212220902.1819159-4-umesh.nerlige.ramappa@intel.com
2022-12-17drm/i915/mtl: Add Wa_14015846243 to fix OA vs CS timestamp mismatchUmesh Nerlige Ramappa
Similar to ACM, OA timestamp that is part of the OA report is shifted when compared to the CS timestamp. Add MTL to the WA. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212220902.1819159-3-umesh.nerlige.ramappa@intel.com
2022-12-17drm/i915/mtl: Resize noa_wait BO size to save restore GPR regsUmesh Nerlige Ramappa
On MTL, gt->scratch was using stolen lmem. An MI_SRM to stolen lmem caused a hang that was attributed to saving and restoring the GPR registers used for noa_wait. Add an additional page in noa_wait BO to save/restore GPR registers for the noa_wait logic. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212220902.1819159-2-umesh.nerlige.ramappa@intel.com
2022-12-09drm/i915/perf: Do not parse context image for HSWUmesh Nerlige Ramappa
An earlier commit introduced a mechanism to parse the context image to find the OA context control offset. This resulted in an NPD on haswell when gem_context was passed into i915_perf_open_ioctl params. Haswell does not support logical ring contexts, so ensure that the context image is parsed only for platforms with logical ring contexts and also validate lrc_reg_state. v2: Fix build failure v3: Fix checkpatch error Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7432 Fixes: a5c3a3cbf029 ("drm/i915/perf: Determine gen12 oa ctx offset at runtime") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123235342.713068-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 95c713d722017b26e301303713d638e0b95b1f68) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-12-07drm/i915/perf: Do not parse context image for HSWUmesh Nerlige Ramappa
An earlier commit introduced a mechanism to parse the context image to find the OA context control offset. This resulted in an NPD on haswell when gem_context was passed into i915_perf_open_ioctl params. Haswell does not support logical ring contexts, so ensure that the context image is parsed only for platforms with logical ring contexts and also validate lrc_reg_state. v2: Fix build failure v3: Fix checkpatch error Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7432 Fixes: a5c3a3cbf029 ("drm/i915/perf: Determine gen12 oa ctx offset at runtime") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123235342.713068-1-umesh.nerlige.ramappa@intel.com
2022-12-06drm/i915: Wrap all access to i915_vma.node.start|sizeChris Wilson
We already wrap i915_vma.node.start for use with the GGTT, as there we can perform additional sanity checks that the node belongs to the GGTT and fits within the 32b registers. In the next couple of patches, we will introduce guard pages around the objects _inside_ the drm_mm_node allocation. That is we will offset the vma->pages so that the first page is at drm_mm_node.start + vma->guard (not 0 as is currently the case). All users must then not use i915_vma.node.start directly, but compute the guard offset, thus all users are converted to use a i915_vma_offset() wrapper. The notable exceptions are the selftests that are testing exact behaviour of i915_vma_pin/i915_vma_insert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com
2022-11-23Merge tag 'drm-intel-next-2022-11-18' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next GVT Changes: - gvt-next stuff mostly with refactor for the new MDEV interface. i915 Changes: - PSR fixes and improvements (Jouni) - DP DSC fixes (Vinod, Jouni) - More general display cleanups (Jani) - More display collor management cleanup targetting degamma (Ville) - remove circ_buf.h includes (Jiri) - wait power off delay at driver remove to optimize probe (Jani) - More audio cleanup targeting the ELD precompute readout (Ville) - Enable DC power states on all eDP ports (Imre) - RPL-P stepping info (Matt Atwood) - MTL enabling patches (RK) - Removal of DG2 force_probe (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y3f71obyEkImXoUF@intel.com
2022-11-16drm/i915: call i915_request_await_object from _i915_vma_move_to_activeAndrzej Hajda
Since almost all calls to i915_vma_move_to_active are prepended with i915_request_await_object, let's call the latter from _i915_vma_move_to_active by default and add flag allowing bypassing it. Adjust all callers accordingly. The patch should not introduce functional changes. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com
2022-11-14Merge drm/drm-next into drm-intel-nextRodrigo Vivi
Catch up on 6.1-rc cycle in order to solve the intel_backlight conflict on linux-next. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-11-11drm/i915: stop including i915_irq.h from i915_trace.hJani Nikula
Turns out many of the files that need i915_reg.h get it implicitly via {display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h -> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h, makes sense to drop it, but that requires adding quite a few new includes all over the place. Prefer including i915_reg.h where needed instead of adding another implicit include, because eventually we'll want to split up i915_reg.h and only include the specific registers at each place. Also some places actually needed i915_irq.h too. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com
2022-11-10drm/i915: Partial abandonment of legacy DRM logging macrosTvrtko Ursulin
Convert some usages of legacy DRM logging macros into versions which tell us on which device have the events occurred. v2: * Don't have struct drm_device as local. (Jani, Ville) v3: * Store gt, not i915, in workaround list. (John) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com
2022-10-27drm/i915/perf: Enable OA for DG2Umesh Nerlige Ramappa
OA was disabled for DG2 as support was missing. Enable it back now. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-17-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: complete programming whitelisting for XEHPSDVLionel Landwerlin
We have an additional register to select which slices contribute to OAG/OAG counter increments. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-16-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/guc: Support OA when Wa_16011777198 is enabledVinay Belgaumkar
On DG2, a w/a resets RCS/CCS before it goes into RC6. This breaks OA since OA does not expect engine resets during its use. Fix it by disabling RC6. v2: (Ashutosh) - Bring back slpc_unset_param helper - Update commit msg - Use with_intel_runtime_pm helper for set/unset v3: (Ashutosh) - Just use intel_uc_uses_guc_rc Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-15-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Apply Wa_18013179988Umesh Nerlige Ramappa
OA reports in the OA buffer contain an OA timestamp field that helps user calculate delta between 2 OA reports. The calculation relies on the CS timestamp frequency to convert the timestamp value to nanoseconds. The CS timestamp frequency is a function of the CTC_SHIFT value in RPM_CONFIG0. In DG2, OA unit assumes that the CTC_SHIFT is 3, instead of using the actual value from RPM_CONFIG0. At the user level, this results in an error in calculating delta between 2 OA reports since the OA timestamp is not shifted in the same manner as CS timestamp. Also the periodicity of the reports is different from what the user configured because of mismatch in the CS and OA frequencies. The issue also affects MI_REPORT_PERF_COUNT command. To resolve this, return actual OA timestamp frequency to the user in i915_getparam_ioctl, so that user can calculate the right OA exponent as well as interpret the reports correctly. MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893 v2: - Use REG_FIELD_GET (Ashutosh) - Update commit msg Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-13-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Add Wa_1508761755:dg2Umesh Nerlige Ramappa
Disable Clock gating in EU when gathering the events so that EU events are not lost. v2: Fix checkpatch issues v3: User MCR helpers to write to MC reg v4: Indent correctly (checkpatch) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-12-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Store a pointer to oa_format in oa_bufferUmesh Nerlige Ramappa
DG2 introduces OA reports with 64 bit report header fields. Perf OA would need more information about the OA format in order to process such reports. Store all OA format info in oa_buffer instead of just the size and format-id. v2: Drop format_size variable (Ashutosh) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-11-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Use gt-specific ggtt for OA and noa-wait buffersUmesh Nerlige Ramappa
User passes uabi engine class and instance to the perf OA interface. Use gt corresponding to the engine to pin the buffers to the right ggtt. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-10-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Replace gt->perf.lock with stream->lock for file opsUmesh Nerlige Ramappa
With multi-gt, user can access multiple OA buffers concurrently. Use stream->lock instead of gt->perf.lock to serialize file operations. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-9-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Move gt-specific data from i915->perf to gt->perfUmesh Nerlige Ramappa
Make perf part of gt as the OAG buffer is specific to a gt. The refactor eventually simplifies programming the right OA buffer and the right HW registers when supporting multiple gts. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-8-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Simply use stream->ctxUmesh Nerlige Ramappa
Earlier code used exclusive_stream to check for user passed context. Simplify this by accessing stream->ctx. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-7-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Enable bytes per clock reporting in OAUmesh Nerlige Ramappa
XEHPSDV and DG2 provide a way to configure bytes per clock vs commands per clock reporting. Enable bytes per clock setting on enabling OA. Bspec: 51762 Bspec: 52201 v2: - Fix commit msg (Ashutosh) - Fix checkpatch issues v3: - s/commands/bytes/ in code comment and commmit msg Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-6-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Determine gen12 oa ctx offset at runtimeUmesh Nerlige Ramappa
Some SKUs of same gen12 platform may have different oactxctrl offsets. For gen12, determine oactxctrl offsets at runtime. v2: (Lionel) - Move MI definitions to intel_gpu_commands.h - Ensure __find_reg_in_lri does read past context image size v3: (Ashutosh) - Drop unnecessary use of double underscores - fix find_reg_in_lri - Return error if oa context offset is U32_MAX - Error out if oa_ctx_ctrl_offset does not find offset v4: (Ashutosh) - Warn on odd MI LRI_LEN - Remove unnecessary check for valid_oactxctrl_offset - Drop valid_oactxctrl_offset macro v5: Drop unrelated comment Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-5-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Fix noa wait predication for DG2Umesh Nerlige Ramappa
Predication for batch buffer commands changed in XEHPSDV. MI_BATCH_BUFFER_START predicates based on MI_SET_PREDICATE_RESULT register. The MI_SET_PREDICATE_RESULT register can only be modified with MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE command sets MI_SET_PREDICATE_RESULT based on bit 0 of MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-4-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Add 32-bit OAG and OAR formats for DG2Umesh Nerlige Ramappa
Add new OA formats for DG2. MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893 v2: - Update commit title (Ashutosh) - Coding style fixes (Lionel) - 64 bit OA formats need UMD changes in GPUvis, drop for now and send in a separate series with UMD changes v3: - Update commit message to drop 64 bit related description Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #1 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-3-umesh.nerlige.ramappa@intel.com
2022-10-27drm/i915/perf: Fix OA filtering logic for GuC modeUmesh Nerlige Ramappa
With GuC mode of submission, GuC is in control of defining the context id field that is part of the OA reports. To filter reports, UMD and KMD must know what sw context id was chosen by GuC. There is not interface between KMD and GuC to determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA reports for the specific context. v2: Explain guc id stealing w.r.t OA use case Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-2-umesh.nerlige.ramappa@intel.com
2022-10-10drm/i915/perf: remove redundant variable 'taken'Colin Ian King
The assignment to variable taken is redundant and so it can be removed as well as the variable too. Cleans up clang-scan build warnings: warning: Although the value stored to 'taken' is used in the enclosing expression, the value is never actually read from 'taken' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221007195345.2749911-1-colin.i.king@gmail.com
2022-08-31drm/i915/perf: replace BUG_ON() with WARN_ON()Jani Nikula
Avoid BUG_ON(). Replace with WARN_ON() and early return. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220830093411.1511040-4-jani.nikula@intel.com
2022-07-08i915/perf: Disable OA sseu config param for gfx12.50+Umesh Nerlige Ramappa
The global sseu config is applicable only to gen11 platforms where concurrent media, render and OA use cases may cause some subslices to be turned off and hence lose NOA configuration. Ideally we want to return ENODEV for non-gen11 platforms, however, this has shipped with gfx12, so disable only for gfx12.50+. v2: gfx12 is already shipped with this, disable for gfx12.50+ (Lionel) v3: (Matt) - Update commit message and replace "12.5" with "12.50" - Replace DRM_DEBUG() with driver specific drm_dbg() Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220707193002.2859653-2-umesh.nerlige.ramappa@intel.com
2022-07-08i915/perf: Replace DRM_DEBUG with driver specific drm_dbg callUmesh Nerlige Ramappa
DRM_DEBUG is not the right debug call to use in i915 OA, replace it with driver specific drm_dbg() call (Matt). Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220707193002.2859653-1-umesh.nerlige.ramappa@intel.com
2022-05-13drm/i915: Fix CFI violation with show_dynamic_id()Nathan Chancellor
When an attribute group is created with sysfs_create_group(), the ->sysfs_ops() callback is set to kobj_sysfs_ops, which sets the ->show() callback to kobj_attr_show(). kobj_attr_show() uses container_of() to get the ->show() callback from the attribute it was passed, meaning the ->show() callback needs to be the same type as the ->show() callback in 'struct kobj_attribute'. However, show_dynamic_id() has the type of the ->show() callback in 'struct device_attribute', which causes a CFI violation when opening the 'id' sysfs node under drm/card0/metrics. This happens to work because the layout of 'struct kobj_attribute' and 'struct device_attribute' are the same, so the container_of() cast happens to allow the ->show() callback to still work. Change the type of show_dynamic_id() to match the ->show() callback in 'struct kobj_attributes' and update the type of sysfs_metric_id to match, which resolves the CFI violation. Fixes: f89823c21224 ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220513075136.1027007-1-tvrtko.ursulin@linux.intel.com
2022-02-25Merge drm/drm-next into drm-intel-gt-nextTvrtko Ursulin
Matt needed some buddy allocator changes for landing DG2 small BAR support patches. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>