|
Apparently this was only useful when enabling ggtt support
for the very first time and never used again.
It is also not useful now that we have the ggtt_dump available
through debugfs.
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
kzalloc() expects a number of bytes, therefore we should convert the number
of dwords into bytes, otherwise we are likely just accessing beyond the
array, causing all kinds of carnage. Also fix up the error handling while
we are here.
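For illustration, a minimal sketch of the fix; the variable names here are
assumptions, not the exact driver code:

  /* Before: num_dw counts u32 entries, but kzalloc() takes bytes, so the
   * buffer ends up num_dw bytes instead of num_dw * sizeof(u32) bytes.
   */
  pf_queue->data = kzalloc(num_dw, GFP_KERNEL);

  /* After: kcalloc() takes a count plus an element size, and also guards
   * against multiplication overflow; bail out cleanly on failure.
   */
  pf_queue->data = kcalloc(num_dw, sizeof(u32), GFP_KERNEL);
  if (!pf_queue->data)
          return -ENOMEM;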
v2:
- Prefer kcalloc (dim)
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821171917.417386-2-matthew.auld@intel.com
|
|
Testing on LNL has shown the media GT's TLBs need to be invalidated via the
GuC; update the PT code accordingly.
v2:
- Do dma_fence_get before first call of invalidation_fence_init (Himal)
- No need to check for valid chain fence (Himal)
Fixes: 3330361543fc ("drm/xe/lnl: Add LNL platform definition")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240820161632.987369-1-matthew.brost@intel.com
|
|
Testing on LNL has shown the media TLBs need to be invalidated via the GuC;
update xe_vm_invalidate_vma accordingly.
v2: Fix 2 tile case
v3: Include missing local change
Fixes: 3330361543fc ("drm/xe/lnl: Add LNL platform definition")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240820160129.986889-1-matthew.brost@intel.com
|
|
Freeing a job depends on job->vm being valid, and the last xe_exec_queue_put
can destroy the VM. Prevent a UAF by freeing the job before calling
xe_exec_queue_put.
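In other words, roughly (the surrounding code is assumed, only the ordering
is the point):

  /* Before: the final put may destroy the VM that freeing the job needs. */
  xe_exec_queue_put(q);
  xe_sched_job_put(job);

  /* After: release the job while job->vm is still guaranteed valid. */
  xe_sched_job_put(job);
  xe_exec_queue_put(q);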
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240820202309.1260755-1-matthew.brost@intel.com
|
|
The HW fence ctx objects are not ref counted but rather tied to the life of
an LRC object. HW fences reference the HW fence ctx, and HW fences can
outlive LRCs, thus resulting in UAF. Drop the HW fence's pointer to the HW
fence ctx and instead store what is needed directly in the HW fence.
v2:
- Fix typo in commit (Ashutosh)
- Use snprintf (Ashutosh)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815193522.16008-1-matthew.brost@intel.com
|
|
With the increase in the size of the recoverable page fault
queue, we want to ensure the initial messages from GuC in
the G2H buffer have space while we transfer those out to the
actual pf_queue. Bump the G2H queue size to account for this
increase in the pf_queue size.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4c2b6974801bcffd8a010d838c8733fa4092573d.1723862633.git.stuart.summers@intel.com
|
|
Currently the page fault queue size is hard coded. However
the hardware supports faulting for each EU and each CS.
For some applications running on hardware with a large
number of EUs and CSs, this can result in an overflow of
the page fault queue.
Add a small calculation to determine the page fault queue
size based on the number of EUs and CSs in the platform as
determined by fuses.
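As a sketch of the calculation (the helper and constant names below are
assumptions; only the EU + CS idea is taken from this change):

  /* One outstanding fault per EU and per command streamer, expressed in
   * G2H message dwords and rounded up to keep the ring math simple.
   */
  unsigned int num_eus, num_cs;

  num_eus = xe_gt_topology_num_eus(gt);          /* hypothetical helper */
  num_cs = hweight64(gt->info.engine_mask);
  pf_queue->num_dw =
          roundup_pow_of_two((num_eus + num_cs) * PF_MSG_LEN_DW);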
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com
|
|
On driver reload we never free up the memory for the pagefault and
access counter workqueues. Add those destroy calls here.
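One possible shape of the teardown, assuming a fini callback and the usual
workqueue fields (names are assumptions, not the exact driver code):

  static void gt_usm_fini_wq(void *arg)
  {
          struct xe_gt *gt = arg;

          destroy_workqueue(gt->usm.acc_wq);
          destroy_workqueue(gt->usm.pf_wq);
  }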
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c9a951505271dc3a7aee76de7656679f69c11518.1723862633.git.stuart.summers@intel.com
|
|
On LNL, because of flat CCS, the driver creates a migrate job to clear
CCS metadata. Extend that to also clear system pages using the GPU.
Inform TTM to allocate pages without __GFP_ZERO to avoid double page
clearing, by clearing out the TTM_TT_FLAG_ZERO_ALLOC flag, and set
TTM_TT_FLAG_CLEARED_ON_FREE while freeing to skip the TTM pool's clear
on free, as xe now takes care of clearing pages. If a BO is in system
placement, such as a BO created with DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING,
and there is a CPU map, then the GPU clear is avoided for such a BO, as
there is no DMA mapping for it at that moment with which to create
migration jobs.
Tested this patch with api_overhead_benchmark_l0 from
https://github.com/intel/compute-benchmarks
Without the patch:
api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation:
UsmMemoryAllocation(api=l0 type=Host size=4KB) 84.206 us
UsmMemoryAllocation(api=l0 type=Host size=1GB) 105775.56 us
Perf tool top 5 entries:
71.44% api_overhead_be [kernel.kallsyms] [k] clear_page_erms
6.34% api_overhead_be [kernel.kallsyms] [k] __pageblock_pfn_to_page
2.24% api_overhead_be [kernel.kallsyms] [k] cpa_flush
2.15% api_overhead_be [kernel.kallsyms] [k] pages_are_mergeable
1.94% api_overhead_be [kernel.kallsyms] [k] find_next_iomem_res
With the patch:
api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation:
UsmMemoryAllocation(api=l0 type=Host size=4KB) 79.439 us
UsmMemoryAllocation(api=l0 type=Host size=1GB) 98677.75 us
Perf tool top 5 entries:
11.16% api_overhead_be [kernel.kallsyms] [k] __pageblock_pfn_to_page
7.85% api_overhead_be [kernel.kallsyms] [k] cpa_flush
7.59% api_overhead_be [kernel.kallsyms] [k] find_next_iomem_res
7.24% api_overhead_be [kernel.kallsyms] [k] pages_are_mergeable
5.53% api_overhead_be [kernel.kallsyms] [k] lookup_address_in_pgd_attr
Without this patch, clear_page_erms() dominates execution time and is not
pipelined with migration jobs. With this patch, page clearing gets
pipelined with the migration job and frees the CPU for more work.
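The TTM-facing part of this amounts to the following sketch (where exactly
the flags get flipped on the struct ttm_tt in xe is simplified here):

  /* At backing-store allocation: don't ask the allocator for zeroed
   * pages, the GPU migrate job will clear them.
   */
  tt->page_flags &= ~TTM_TT_FLAG_ZERO_ALLOC;

  /* At free: the pages were already cleared by the GPU, so tell the TTM
   * pool to skip its own clear-on-free.
   */
  tt->page_flags |= TTM_TT_FLAG_CLEARED_ON_FREE;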
v2: Handle regression on dgfx(Himal)
Update commit message as no ttm API changes needed.
v3: Fix Kunit test.
v4: handle data leak on cpu mmap(Thomas)
v5: s/gpu_page_clear/gpu_page_clear_sys and move setting
it to xe_ttm_sys_mgr_init() and other nits (Matt Auld)
v6: Disable it when init_on_alloc and/or init_on_free is active (Matt)
Use compute-benchmarks, as the reporter used it to report this
allocation latency issue and it is a more suitable test application
than mine. In v5, the test showed a significant reduction in alloc
latency but that is no longer the case; I think this was mostly because
the previous test was done on an IFWI which had low memory bandwidth
from the CPU.
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816135154.19678-2-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|
|
Add TTM_TT_FLAG_CLEARED_ON_FREE, which DRM drivers can set before
releasing backing stores if they want to skip clear-on-free.
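Usage-wise, a driver opts in right before handing the pages back, e.g. (the
driver-side check below is hypothetical):

  static void my_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *tt)
  {
          /* Pages already cleared elsewhere (e.g. by the GPU)? Then skip
           * the TTM pool's clear-on-free.
           */
          if (my_driver_cleared_pages_on_gpu(tt))   /* hypothetical */
                  tt->page_flags |= TTM_TT_FLAG_CLEARED_ON_FREE;

          ttm_pool_free(&bdev->pool, tt);
  }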
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816135154.19678-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|
|
Use the vma_pages() helper function and remove the following
Coccinelle/coccicheck warning reported by vma_pages.cocci:
WARNING: Consider using vma_pages helper on vma
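The transformation the cocci script suggests, in its generic form (the
variable name is illustrative):

  /* Before: open-coded page count. */
  unsigned long num_pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;

  /* After: equivalent, using the helper. */
  unsigned long num_pages = vma_pages(vma);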
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240819095751.539645-2-thorsten.blum@toblux.com
|
|
We should unpin before evicting all memory, and repin after GT resume.
This way, we preserve the contents of the framebuffers, and won't hang
on resume due to migration engine not being restored yet.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806105044.596842-3-maarten.lankhorst@linux.intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
Suspend fbdev sooner, and disable user access before suspending to
prevent some races. I've noticed this when comparing xe suspend to
i915's.
Matches the following commits from i915:
24b412b1bfeb ("drm/i915: Disable intel HPD poll after DRM poll init/enable")
1ef28d86bea9 ("drm/i915: Suspend the framebuffer console earlier during system suspend")
bd738d859e71 ("drm/i915: Prevent modesets during driver init/shutdown")
Thanks to Imre for pointing me to those commits.
Driver shutdown is currently missing, but I have some idea how to
implement it next.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806105044.596842-2-maarten.lankhorst@linux.intel.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
The fence lock is part of the queue, therefore in the current design
anything locking the fence should then also hold a ref to the queue to
prevent the queue from being freed.
However, currently it looks like we signal the fence and then drop the
queue ref, but if something is waiting on the fence, the waiter is
kicked to wake up at some later point, where upon waking up it first
grabs the lock before checking the fence state. But if we have already
dropped the queue ref, then the lock might already be freed as part of
the queue, leading to a UAF.
To prevent this, move the fence lock into the fence itself so we don't
run into lifetime issues. An alternative might be a device-level lock,
or only releasing the queue in the fence release callback; however,
that might require pushing to another worker to avoid locking issues.
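Structurally, the fix looks like the following sketch (type and field names
are illustrative, not the exact xe ones):

  struct example_fence {
          struct dma_fence base;
          spinlock_t lock;        /* was: a pointer to the queue-owned lock */
  };

  static void example_fence_init(struct example_fence *f,
                                 const struct dma_fence_ops *ops,
                                 u64 context, u64 seqno)
  {
          /* The lock now lives and dies with the fence itself. */
          spin_lock_init(&f->lock);
          dma_fence_init(&f->base, ops, &f->lock, context, seqno);
  }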
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2454
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2342
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2020
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814110129.825847-2-matthew.auld@intel.com
|
|
A BO from xe_bo_create_user() will always be of type
ttm_bo_type_device, so remove that redundant parameter.
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816102248.25628-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|
|
Those counters were used to keep track of the number of VMs in fault mode
and in non-fault mode, to determine if the whole device was in fault mode
or not. This is no longer needed, so remove those variables and their
usages.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-12-francois.dugast@intel.com
|
|
With this restriction, all VMs on the device must be faulting VMs if there
is already one faulting VM, in which case the device is considered in
fault mode. This prevents for example an application from running 3D jobs
for the compositor while submitting an SVM compute job on the same device.
Now that mutual exclusion of faulting LR jobs and dma fence jobs is
ensured on the hw engine group, remove this restriction to allow running
faulting and non-faulting VMs on the same device.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-11-francois.dugast@intel.com
|
|
If the job about to be submitted is a dma-fence job, update the current
execution mode of the hw engine group. This triggers an immediate suspend
of the exec queues running faulting long-running jobs.
If the job about to be submitted is a long-running job, kick a new worker
used to resume the exec queues running faulting long-running jobs once
the dma-fence jobs have completed.
v2: Kick the resume worker from exec IOCTL, switch to unordered workqueue,
destroy it after use (Matt Brost)
v3: Do not resume if no exec queue was suspended (Matt Brost)
v4: Squash commits (Matt Brost)
v5: Do not kick the worker when xe_vm_in_preempt_fence_mode (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-10-francois.dugast@intel.com
|
|
Provide a way to safely transition execution modes of the hw engine group
ahead of the actual execution. When necessary, either wait for running
jobs to complete or preempt them, thus ensuring mutual exclusion between
execution modes.
Unlike a mutex, the rw_semaphore used in this context allows multiple
submissions in the same mode.
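Conceptually (the types and helpers below are illustrative, not the xe API):

  struct engine_group {                   /* illustrative only */
          struct rw_semaphore mode_sem;
          int cur_mode;
  };

  /* Same-mode submissions can proceed concurrently as readers. */
  static void group_submit(struct engine_group *group, void *job)
  {
          down_read(&group->mode_sem);
          do_submit(group, job);          /* hypothetical */
          up_read(&group->mode_sem);
  }

  /* A mode switch takes the semaphore as a writer, so it waits for (or
   * preempts) everything submitted in the old mode before flipping.
   */
  static void group_set_mode(struct engine_group *group, int mode)
  {
          down_write(&group->mode_sem);
          group->cur_mode = mode;
          up_write(&group->mode_sem);
  }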
v2: Use lockdep_assert_held_write, add annotations (Matt Brost)
v3: Fix kernel doc, remove redundant code (Matt Brost)
v4: Now that xe_hw_engine_group_suspend_faulting_lr_jobs can fail,
propagate the error to the caller (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-9-francois.dugast@intel.com
|
|
This is a required feature for faulting long running jobs not to be
submitted while dma fence jobs are running on the hw engine group.
v2: Switch to lockdep_assert_held_write in worker, get a proper reference
for the last fence (Matt Brost)
v3: Directly call dma_fence_put with the fence ref (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-8-francois.dugast@intel.com
|
|
Ensure we can safely take a ref of the exec queue's last fence from the
context of resuming jobs from the hw engine group. The locking requirements
differ from the general case, hence the introduction of this new function.
v2: Add kernel doc, rework the code to prevent code duplication
v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost)
v4: Remove new put function (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-7-francois.dugast@intel.com
|
|
This code section is the same as the body of
xe_exec_queue_last_fence_put_unlocked() so call the function instead and
remove duplicated code to make maintenance easier.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-6-francois.dugast@intel.com
|
|
This is a required feature for dma fence jobs to preempt faulting long
running jobs in order to ensure mutual exclusion on a given hw engine
group.
v2: Pipeline calls to suspend(q) and suspend_wait(q) to improve
efficiency, switch to lockdep_assert_held_write (Matt Brost)
v3: Return error on suspend_wait failure to propagate on the call stack
up to IOCTL (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-5-francois.dugast@intel.com
|
|
Add helpers to safely add and delete the exec queues attached to a hw
engine group, and make use of them at the time of creation and destruction
of the exec queues. Keeping track of them is required to control the
execution mode of the hw engine group.
v2: Improve error handling and robustness, suspend exec queues created in
fault mode if group in dma-fence mode, init queue link (Matt Brost)
v3: Delete queue from hw engine group when it is destroyed by the user,
also clean up at the time of closing the file in case the user did
not destroy the queue
v4: Use correct list when checking if empty, do not add the queue if VM
is in xe_vm_in_preempt_fence_mode (Matt Brost)
v5: Remove unrelated newline, add checks and asserts for group, unwind on
suspend failure (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-4-francois.dugast@intel.com
|
|
Rely on wait_event_interruptible_timeout() to put the process to sleep
with TASK_INTERRUPTIBLE. It allows using this function in interruptible
context.
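The return-value handling that v2 refers to looks roughly like this (the
wait queue, condition and timeout are placeholders):

  long ret;

  ret = wait_event_interruptible_timeout(wq, condition,
                                          msecs_to_jiffies(timeout_ms));
  if (ret < 0)            /* interrupted by a signal: -ERESTARTSYS */
          return ret;
  if (!ret)               /* condition still false after the timeout */
          return -ETIME;
  /* ret > 0: condition became true, ret is the remaining jiffies */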
v2: Propagate error on wait_event_interruptible_timeout (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-3-francois.dugast@intel.com
|
|
An xe_hw_engine_group is a group of hw engines. Two hw engines belong to
the same xe_hw_engine_group if one hw engine cannot make progress while
the other is stuck on a page fault.
Typically, hw engines of the same group share some resources such as EUs,
but this really depends on the hardware configuration of the platforms.
The simple engines partitioning proposed here might be too conservative
but is intended to work for existing platforms. It can be optimized later
if more sets of independent engines are identified.
The hw engine groups are intended to be used in the context of faulting
long-running job submissions.
v2: Move to own files, improve error handling (Matt Brost)
v3: Fix build issue reported by CI, improve commit message (Matt Roper)
v4: Fix kernel doc
v5: Add switch case for XE_ENGINE_CLASS_OTHER
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-2-francois.dugast@intel.com
|
|
User binds map to engines which can fault, and faults depend on user bind
completion, thus we can deadlock. Avoid this by using a reserved copy
engine for user binds on faulting devices.
While we are here, normalize bind queue creation with a helper.
v2:
- Pass in extensions to bind queue creation (CI)
v3:
- s/resevered/reserved (Lucas)
- Fix NULL hwe check (Jonathan)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816034033.53837-1-matthew.brost@intel.com
|
|
When steering MCR register ranges of type "DSS," the group_id and
instance_id values are calculated by dividing the DSS pool according to
the size of a gslice or cslice, depending on the platform. These values
haven't changed much on past platforms, so we've been able to hardcode
the proper divisor so far. However the layout may not be so fixed on
future platforms so the proper, future-proof way to determine this is by
using some of the attributes from the GuC's hwconfig table. The
hwconfig has two attributes reflecting the architectural maximum slice
and subslice counts (i.e., before any fusing is considered) that can be
used for the purposes of calculating MCR steering targets.
If the hwconfig is lacking the necessary values (which should only be
possible on older platforms before these attributes were added), we can
still fall back to the old hardcoded values. Going forward the hwconfig
is expected to always provide the information we need on newer
platforms, and any failure to do so will be considered a bug in the
firmware that will prevent us from switching to the buggy firmware
release.
It's worth noting that over time GuC's hwconfig has provided a couple
different keys with similar-sounding descriptions. For our purposes
here, we only trust the newer key "70" which has supplanted the
similarly-named key "2" that existed on older platforms.
Bspec: 73210
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-4-matthew.d.roper@intel.com
|
|
Although the query uapi is the official way to get at the GuC's hwconfig
table contents, it's still useful to have a quick debugfs interface to
dump the table in a human-readable format while debugging the driver.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-3-matthew.d.roper@intel.com
|
|
Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for
the display side. This resolves the current conflict for the
enable_display module parameter and allows further pending refactors.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Replace for_each_tile plus a check against primary tile with
for_each_remote_tile in tiles_fini. The latter macro does this for us.
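I.e., roughly (the loop body is illustrative):

  struct xe_tile *tile;
  u8 id;

  /* for_each_remote_tile() already skips the primary tile, so the
   * explicit "is this the root tile?" check goes away.
   */
  for_each_remote_tile(tile, xe, id)
          tile_fini(tile);        /* per-tile teardown, illustrative */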
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816040208.62695-1-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Exec_queue cleanup requires HW access, so we need to use devm instead of
drmm for it.
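The shape of the change, sketched (the action name and data pointer are
hypothetical):

  /* Before: drmm action, may run only after the last drm_device reference
   * is dropped, by which point HW/MMIO access may already be gone.
   */
  drmm_add_action_or_reset(&xe->drm, fini_action_drmm, data);

  /* After: devm action, runs at device unbind while the HW is still
   * reachable. Note the callback prototypes differ: drmm actions take
   * (struct drm_device *, void *), devm actions take (void *).
   */
  devm_add_action_or_reset(xe->drm.dev, fini_action_devm, data);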
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Drmm actions are not the right ones to clean up BOs and we should use
devm instead. However, we can also instead just allocate the objects
using the managed_bo function, which will internally register the
correct cleanup call and therefore allows us to simplify the code.
While at it, switch to drmm_kzalloc for the GSC proxy allocation to
further simplify the cleanup.
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- Type-C programming fix for MTL+ (Gustavo)
- Fix display clock workaround (Mitul)
- Fix DP LTTPR detection (Imre)
- Calculate vblank delay more accurately (Ville)
- Make vrr_{enabling,disabling}() usable outside intel_display.c (Ville)
- FBC clean-up (Ville)
- DP link-training fixes and clean-up (Imre)
- Make I2C terminology more inclusive (Easwar)
- Make read-only array bw_gbps static const (Colin)
- HDCP fixes and improvements (Suraj)
- DP VSC SDP fixes and clean-ups (Suraj, Mitul)
- Fix opregion leak in Xe code (Lucas)
- Fix possible int overflow in skl_ddi_calculate_wrpll (Nikita)
- General display clean-ups and conversion towards intel_display (Jani)
- On DP MST, Enable LT fallback for UHBR<->non-UHBR rates (Imre)
- Add VRR condition for DPKGC Enablement (Suraj)
- Use backlight power constants (Zimmermann)
- Correct dual pps handling for MTL_PCH+ (Dnyaneshwar)
- Dump DSC HW state (Imre)
- Replace double blank with single blank after comma (Andi)
- Read display register timeout on BMG (Mitul)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZruWsyTv3nzdArDk@intel.com
|
|
Only set tile->mmio.regs to NULL in tile_fini if it is not the root tile.
The root tile's mmio regs are set up earlier in MMIO init, thus they
should be set to NULL in mmio_fini.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809232830.3302251-1-matthew.brost@intel.com
|
|
The BO cleanup touches the GGTT and therefore requires the HW to be
available, so we need to use devm instead of drmm.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1160
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809231237.1503796-2-daniele.ceraolospurio@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
An upcoming PXP patch will kill queues at runtime when a PXP
invalidation event occurs, so we need exec_queue_kill to be safe to call
multiple times.
v2: Add documentation (Matt B)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814205654.1716586-1-daniele.ceraolospurio@intel.com
|
|
This was fixed in commit b7dce525c4fc ("drm/xe/queue: fix engine_class
bounds check"), but then re-introduced in commit 6f20fc09936e ("drm/xe:
Move and export xe_hw_engine lookup."), which was meant to be simple code
movement of the existing function.
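For reference, the kind of check that went missing, in generic form (the
array name is taken from the original fix as best as can be told):

  /* Reject user-supplied engine classes outside the lookup table. */
  if (eci.engine_class >= ARRAY_SIZE(user_to_xe_engine_class))
          return NULL;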
Fixes: 6f20fc09936e ("drm/xe: Move and export xe_hw_engine lookup.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812141331.729843-2-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Register STATELESS_COMPRESSION_CTRL should be considered an
MCR register which should be written to all slices as per
the documentation.
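The nature of the fix, sketched (OFFSET stands in for the real register
offset; XE_REG()/XE_REG_MCR() are the xe register declaration macros):

  /* Before: declared as a plain register, writes are not explicitly
   * multicast.
   */
  #define STATELESS_COMPRESSION_CTRL      XE_REG(OFFSET)

  /* After: declared as MCR, so the MCR write helpers replicate the write
   * to all slices.
   */
  #define STATELESS_COMPRESSION_CTRL      XE_REG_MCR(OFFSET)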
Bspec: 71185
Fixes: ecabb5e6ce54 ("drm/xe/xe2: Add performance turning changes")
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-4-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Register GAMREQSTRM_CTRL should be considered an MCR register
which should be written to all slices as per the documentation.
Bspec: 71185
Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-3-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
xe_gt_enable_host_l2_vram() is reading the XE2_GAMREQSTRM_CTRL register
that is currently missing the MCR annotation. However, just adding the
annotation doesn't work as this function is called before MCR handling
is initialized in xe_gt_mcr_init().
xe_gt_enable_host_l2_vram() is used to implement WA 16023588340 that
needs to be done as early as possible during initialization in order to
be effective since the MMIO writes impact it. In the failure scenario,
the driver would simply not be able to bind successfully.
Moving xe_gt_enable_host_l2_vram() later, after MCR initialization is
done, only incurs a few additional HW accesses, particularly when
loading GuC for hwconfig. Binding/unbinding the driver 100 times in BMG
still works so it should be ok to start handling the WA a little bit
later. This is sufficient to allow adding the MCR annotation to
XE2_GAMREQSTRM_CTRL.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-2-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The different approach used by xe regarding the initialization of
display HW has proved to be a great addition for early driver bring up:
core xe can be tested without having all the bits sorted out on the
display side.
On the other hand, the approach exposed by i915-display is to *actively*
disable the display by programming it if needed, i.e. if it was left
enabled by firmware. It also has its use to make sure the HW is actually
disabled and not wasting power.
However, having both the way it is in xe doesn't expose a good interface
with respect to module params. From modinfo:
disable_display:Disable display (default: false) (bool)
enable_display:Enable display (bool)
Rename enable_display to probe_display to try to convey the message that
the HW is being touched and improve the module param description. To
avoid confusion, the enable_display is renamed everywhere, not only in
the module param. New description for the parameters:
disable_display:Disable display (default: false) (bool)
probe_display:Probe display HW, otherwise it's left untouched (default: true) (bool)
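As a sketch of the resulting declaration (the exact macro and permission
bits are assumptions):

  module_param_named(probe_display, xe_modparam.probe_display, bool, 0444);
  MODULE_PARM_DESC(probe_display,
                   "Probe display HW, otherwise it's left untouched (default: true)");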
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813141931.3141395-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Remove the xe parameter from the pde_encode_pat_index and
pte_encode_pat_index functions, as it is no longer used.
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813104419.2958046-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Early in the development of Xe we identified an issue with SVG state
handling on DG2 and MTL (and later on Xe2 as well). In
commit 72ac304769dd ("drm/xe: Emit SVG state on RCS during driver load
on DG2 and MTL") and commit fb24b858a20d ("drm/xe/xe2: Update SVG state
handling") we implemented our own workaround to prevent SVG state from
leaking from context A to context B in cases where context B never
issues a specific state setting.
The hardware teams have now created official workaround Wa_14019789679
to cover this issue. The workaround description only requires emitting
3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
that would potentially remain unset by a context B, but still cause
notable issues if unwanted values were inherited from context A.
However since we already have a more extensive implementation that emits
the entire SVG state and prevents _any_ SVG state from unintentionally
leaking, we'll stick with our existing implementation just to be safe.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812181042.2013508-2-matthew.d.roper@intel.com
|
|
There are enough users of the kernel device to xe device conversion; add a
helper for it.
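A plausible shape for such a helper, assuming the device drvdata holds the
drm_device (the actual name and placement may differ):

  static inline struct xe_device *kdev_to_xe_device(struct device *kdev)
  {
          struct drm_device *drm = dev_get_drvdata(kdev);

          return to_xe_device(drm);
  }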
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/38c80846e70c7e410850530426384e17cff9d031.1723458544.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
We have a helper for converting pci device to xe device, use it.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1b87c2e56200e001ce3a5d2f4a93eb26b294df32.1723458544.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
xe_call_for_each_device() has been unused since commit 57ecead343e7
("drm/xe/tests: Convert xe_mocs live tests"). Remove it and the related
dev_to_xe_device_fn() and struct kunit_test_data.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fa3bb23d005313c9797f557e1211fde09fcb59cc.1723458544.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Parameterize clearing of CCS and BO data in xe_migrate_clear(), which
higher layers can utilize. This patch will be used later on when doing BO
data clear for igfx as well.
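Sketch of the resulting call (flag names per v3 above; the exact
xe_migrate_clear() signature is an assumption):

  /* Clear both the BO contents and its CCS metadata in one migrate job. */
  fence = xe_migrate_clear(migrate, bo, bo->ttm.resource,
                           XE_MIGRATE_CLEAR_FLAG_BO_DATA |
                           XE_MIGRATE_CLEAR_FLAG_CCS_DATA);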
v2: Replace multiple params with flags in xe_migrate_clear (Matt B)
v3: s/CLEAR_BO_DATA_FLAG_*/XE_MIGRATE_CLEAR_FLAG_* and move to
xe_migrate.h. other nits(Matt B)
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809220347.25330-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
|
|
We have a helper for converting pci device to i915 device, use it.
v2: Also convert i915_pci_probe() (Gustavo)
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812103415.1540096-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|