summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-09-11Merge tag 'amd-drm-next-6.12-2024-09-06' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.12-2024-09-06: amdgpu: - IPS updates - Post divider fix - DML2 updates - Misc static checker fixes - DCN 3.5 fixes - Replay fixes - DMCUB updates - SWSMU fixes - DP MST fixes - Add debug flag for per queue resets - devcoredump updates - SR-IOV fixes - MES fixes - Always allocate cleared VRAM for GEM - Pipe reset for GC 9.4.3 - ODM policy fixes - Per queue reset support for GC 10 - Per queue reset support for GC 11 - Per queue reset support for GC 12 - Display flickering fixes - MPO fixes - Display sharpening updates amdkfd: - SVM fix for IH for APUs Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240906211008.3072097-1-alexander.deucher@amd.com
2024-09-11Merge tag 'drm-intel-gt-next-2024-09-06' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Driver Changes: - Expose fan speed via hwmon (Raag) - Correction to Wa_14019159160 on ARL (John H) - Whitelist COMMON_SLICE_CHICKEN1 for UMD access on DG2/MTL/ARL (Dnyaneshwar) - Do not attempt to load the GSC multiple times to avoid hanging GSC HW (Daniele) - Populate /sys/class/drm/cardX/engines/ even if one engine fails (Andi) - Use kmemdup_array instead of kmemdup for multiple allocation (Yu) - Remove extra unlikely() (Hongbo) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Ztrfr_Wuurfa-3Rv@jlahtine-mobl.ger.corp.intel.com
2024-09-10drm/amdgpu: get rid of bogus includes of fdtable.hAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amdkfd: CRIU fixesAl Viro
Instead of trying to use close_fd() on failure exits, just have criu_get_prime_handle() store the file reference without inserting it into descriptor table. Then, once the callers are past the last failure exit, they can go and either insert all those file references into the corresponding slots of descriptor table, or drop all those file references and free the unused descriptors. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amdgpu: fix a race in kfd_mem_export_dmabuf()Al Viro
Using drm_gem_prime_handle_to_fd() to set dmabuf up and insert it into descriptor table, only to have it looked up by file descriptor and remove it from descriptor table is not just too convoluted - it's racy; another thread might have modified the descriptor table while we'd been going through that song and dance. Switch kfd_mem_export_dmabuf() to using drm_gem_prime_handle_to_dmabuf() and leave the descriptor table alone... Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm: new helper: drm_gem_prime_handle_to_dmabuf()Al Viro
Once something had been put into descriptor table, the only thing you can do with it is returning descriptor to userland - you can't withdraw it on subsequent failure exit, etc. You certainly can't count upon it staying in the same slot of descriptor table - another thread could've played with close(2)/dup2(2)/whatnot. drm_gem_prime_handle_to_fd() creates a dmabuf, allocates a descriptor and attaches dmabuf's file to it (the last two steps are done in dma_buf_fd()). That's nice when all you are going to do is passing a descriptor to userland. If you just need to work with the resulting object or have something else to be done that might fail, drm_gem_prime_handle_to_fd() is racy. The problem is analogous to one with anon_inode_getfd(), and solution is similar to what anon_inode_getfile() provides. Add drm_gem_prime_handle_to_dmabuf() - the "set dmabuf up" parts of drm_gem_prime_handle_to_fd() without the descriptor-related ones. Instead of inserting into descriptor table and returning the file descriptor it just returns the struct file. drm_gem_prime_handle_to_fd() becomes a wrapper for it. Other users will be introduced in the next commit. Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amdgpu/atomfirmware: Silence UBSAN warningAlex Deucher
Per the comments, these are variable sized arrays. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3613 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare'Srinivasan Shanmugam
This commit updates described non-existent parameters 'resv' and 'sync_mode', and failed to describe the existing 'sync' parameter. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c:50: warning: Function parameter or struct member 'sync' not described in 'amdgpu_vm_cpu_prepare' drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c:50: warning: Excess function parameter 'resv' description in 'amdgpu_vm_cpu_prepare' drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c:50: warning: Excess function parameter 'sync_mode' description in 'amdgpu_vm_cpu_prepare' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amd/amdgpu: apply command submission parser for JPEG v1David (Ming Qiang) Wu
Similar to jpeg_v2_dec_ring_parse_cs() but it has different register ranges and a few other registers access. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amd/amdgpu: apply command submission parser for JPEG v2+David (Ming Qiang) Wu
This patch extends the same cs parser from JPEG v4.0.3 to other JPEG versions (v2 and above). Rename to more common name as jpeg_v2_dec_ring_parse_cs() from jpeg_v4_0_3_dec_ring_parse_cs(). Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3Kenneth Feng
fix the pp_dpm_pcie issue on smu v14.0.2/3 as below: 0: 2.5GT/s, x4 250Mhz 1: 8.0GT/s, x4 616Mhz * 2: 8.0GT/s, x4 1143Mhz * the middle level can be removed since it is always skipped on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amd/pm: update the features set on smu v14.0.2/3Kenneth Feng
update the features set on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amdkfd: Fix resource leak in criu restore queueJesse Zhang
To avoid memory leaks, release q_extra_data when exiting the restore queue. v2: Correct the proto (Alex) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10drm/amd/display: Do not reset planes based on crtc zpos_changedLeo Li
[Why] drm_normalize_zpos will set the crtc_state->zpos_changed to 1 if any of it's assigned planes changes zpos, or is removed/added from it. To have amdgpu_dm request a plane reset on this is too broad. For example, if only the cursor plane was moved from one crtc to another, the crtc's zpos_changed will be set to true. But that does not mean that the underlying primary plane requires a reset. [How] Narrow it down so that only the plane that has a change in zpos will require a reset. As a future TODO, we can further optimize this by only requiring a reset on z-order change. Z-order is different from z-pos, since a zpos change doesn't necessarily mean the z-ordering changed, and DC should only require a reset if the z-ordering changed. For example, the following zpos update does not change z-ordering: Plane A: zpos 2 -> 3 Plane B: zpos 1 -> 2 => Plane A is still on top of plane B: no reset needed Whereas this one does change z-ordering: Plane A: zpos 2 -> 1 Plane B: zpos 1 -> 2 => Plane A changed from on top, to below plane B: reset needed Fixes: 38e0c3df6dbd ("drm/amd/display: Move PRIMARY plane zpos higher") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3569 Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-10Merge tag 'drm-xe-next-2024-09-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Cross-subsystem Changes: - Split dma fence array creation into alloc and arm (Matthew Brost) Driver Changes: - Move kernel_lrc to execlist backend (Ilia) - Fix type width for pcode coommand (Karthik) - Make xe_drm.h include unambiguous (Jani) - Fixes and debug improvements for GSC load (Daniele) - Track resources and VF state by PF (Michal Wajdeczko) - Fix memory leak on error path (Nirmoy) - Cleanup header includes (Matt Roper) - Move pcode logic to tile scope (Matt Roper) - Move hwmon logic to device scope (Matt Roper) - Fix media TLB invalidation (Matthew Brost) - Threshold config fixes for PF (Michal Wajdeczko) - Remove extra "[drm]" from logs (Michal Wajdeczko) - Add missing runtime ref (Rodrigo Vivi) - Fix circular locking on runtime suspend (Rodrigo Vivi) - Fix rpm in TTM swapout path (Thomas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/eirx5vdvoflbbqlrzi5cip6bpu3zjojm2pxseufu3rlq4pp6xv@eytjvhizfyu6
2024-09-08Linux 6.11-rc7v6.11-rc7Linus Torvalds
2024-09-08Merge tag 'timers_urgent_for_v6.11_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: - Remove percpu irq related code in the timer-of initialization routine as it is broken but also unused (Daniel Lezcano) - Fix return -ETIME when delta exceeds INT_MAX and the next event not taking effect sometimes (Jacky Bai) * tag 'timers_urgent_for_v6.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/imx-tpm: Fix next event not taking effect sometime clocksource/drivers/imx-tpm: Fix return -ETIME when delta exceeds INT_MAX clocksource/drivers/timer-of: Remove percpu irq related code
2024-09-08Merge tag 'perf_urgent_for_v6.11_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Fix perf's AUX buffer serialization - Prevent uninitialized struct members in perf's uprobes handling * tag 'perf_urgent_for_v6.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/aux: Fix AUX buffer serialization uprobes: Use kzalloc to allocate xol area
2024-09-08Merge tag 'char-misc-6.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc/other driver fixes for 6.11-rc7. It's nothing huge, just a bunch of small fixes of reported problems, including: - lots of tiny iio driver fixes - nvmem driver fixex - binder UAF bugfix - uio driver crash fix - other small fixes All of these have been in linux-next this week with no reported problems" * tag 'char-misc-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) VMCI: Fix use-after-free when removing resource in vmci_resource_remove() Drivers: hv: vmbus: Fix rescind handling in uio_hv_generic uio_hv_generic: Fix kernel NULL pointer dereference in hv_uio_rescind misc: keba: Fix sysfs group creation dt-bindings: nvmem: Use soc-nvmem node name instead of nvmem nvmem: Fix return type of devm_nvmem_device_get() in kerneldoc nvmem: u-boot-env: error if NVMEM device is too small misc: fastrpc: Fix double free of 'buf' in error path binder: fix UAF caused by offsets overwrite iio: imu: inv_mpu6050: fix interrupt status read for old buggy chips iio: adc: ad7173: fix GPIO device info iio: adc: ad7124: fix DT configuration parsing iio: adc: ad_sigma_delta: fix irq_flags on irq request iio: adc: ads1119: Fix IRQ flags iio: fix scale application in iio_convert_raw_to_processed_unlocked iio: adc: ad7124: fix config comparison iio: adc: ad7124: fix chip ID mismatch iio: adc: ad7173: Fix incorrect compatible string iio: buffer-dmaengine: fix releasing dma channel on error iio: adc: ad7606: remove frstdata check for serial mode ...
2024-09-08Merge tag 'usb-6.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a handful of small USB fixes for 6.11-rc7. Included in here are: - dwc3 driver fixes for two reported problems - two typec ucsi driver fixes - cdns2 controller reset fix All of these have been in linux-next this week with no reported problems" * tag 'usb-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: ucsi: Fix cable registration usb: typec: ucsi: Fix the partner PD revision usb: cdns2: Fix controller reset issue usb: dwc3: core: update LC timer as per USB Spec V3.2 usb: dwc3: Avoid waking up gadget during startxfer
2024-09-07Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A pile of Qualcomm clk driver fixes with two main themes: the alpha PLL driver and shared RCGs, and one fix for the Starfive JH7110 SoC. - The Alpha PLL clk_ops had multiple problems around setting rates. There are a handful of patches here that fix masks and skip enabling the clk from set_rate() when the PLL is disabled. The PLLs are crucial to operation of the system as almost all frequencies in the system are derived from them. - Parking shared RCGs at a slow always on clk at registration time breaks stuff. USB host mode can't handle such a slow frequency and the serial console gets all garbled when the UART clk is handed over to the kernel. There's a few patches that don't use the shared clk_ops for the UART clks and another one to skip parking the USB clk at registration time. - The Starfive PLL driver used for the CPU was busted causing cpufreq to fail because the clk didn't change to a safe parent during set_rate(). The fix is to register a notifier and switch to a safe parent so the PLL can change rate in a glitch free manner" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: qcom: gcc-sc8280xp: don't use parking clk_ops for QUPs clk: starfive: jh7110-sys: Add notifier for PLL0 clock clk: qcom: gcc-sm8650: Don't use shared clk_ops for QUPs clk: qcom: gcc-sm8550: Don't park the USB RCG at registration time clk: qcom: gcc-sm8550: Don't use parking clk_ops for QUPs clk: qcom: gcc-x1e80100: Don't use parking clk_ops for QUPs clk: qcom: ipq9574: Update the alpha PLL type for GPLLs clk: qcom: gcc-x1e80100: Fix USB 0 and 1 PHY GDSC pwrsts flags clk: qcom: clk-alpha-pll: Update set_rate for Zonda PLL clk: qcom: clk-alpha-pll: Fix zonda set_rate failure when PLL is disabled clk: qcom: clk-alpha-pll: Fix the trion pll postdiv set rate API clk: qcom: clk-alpha-pll: Fix the pll post div mask
2024-09-07Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Single ufs driver fix quirking around another device spec violation" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: ufs-mediatek: Add UFSHCD_QUIRK_BROKEN_LSDBS_CAP
2024-09-07Merge tag 'pinctrl-v6.11-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fix from Linus Walleij: "A single fix for Qualcomm laptops that are affected by missing wakeup IRQs" * tag 'pinctrl-v6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: qcom: x1e80100: Bypass PDC wakeup parent for now
2024-09-07Merge tag 'drm-msm-next-2024-09-02' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.12 DPU: - Fix implement DP/PHY mapping on SC8180X - Enable writeback on SM8150, SC8180X, SM6125, SM6350 DP: - Enable widebus on all relevant chipsets DSI: - Fix PHY programming on SM8350 / SM8450 HDMI: - Add support for HDMI on MSM8998 MDP5: - NULL string fix GPU: - A642L speedbin support - A615 support - A306 support - A621 support - Expand UBWC uapi - A7xx GPU devcoredump fixes - A5xx preemption fixes - cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGudK7YMiKDhtvYgp=bY64OZZt0UQSkEkSxLo4rLmeVd9g@mail.gmail.com
2024-09-06Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest PullKUnit fix from Shuah Khan: "Fix to a missing function parameter warning found during documentation build in linux-next" * tag 'linux_kselftest-kunit-fixes-6.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Fix missing kerneldoc comment
2024-09-06Merge tag 'pci-v6.11-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Unregister platform devices for child nodes when stopping a PCI device, even if the PCI core has already cleared the OF_POPULATED bit and of_platform_depopulate() doesn't do anything (Bartosz Golaszewski) - Rescan the bus from a separate thread so we don't deadlock when triggering rescan from sysfs (Bartosz Golaszewski) * tag 'pci-v6.11-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/pwrctl: Rescan bus on a separate thread PCI: Don't rely on of_platform_depopulate() for reused OF-nodes
2024-09-06Merge tag 'v6.11-rc6-cifs-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - fix potential mount hang - fix retry problem in two types of compound operations - important netfs integration fix in SMB1 read paths - fix potential uninitialized zero point of inode - minor patch to improve debugging for potential crediting problems * tag 'v6.11-rc6-cifs-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: netfs, cifs: Improve some debugging bits cifs: Fix SMB1 readv/writev callback in the same way as SMB2/3 cifs: Fix zero_point init on inode initialisation smb: client: fix double put of @cfile in smb2_set_path_size() smb: client: fix double put of @cfile in smb2_rename_path() smb: client: fix hang in wait_for_response() for negproto
2024-09-06KVM: x86: don't fall through case statements without annotationsLinus Torvalds
clang warns on this because it has an unannotated fall-through between cases: arch/x86/kvm/x86.c:4819:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough] and while we could annotate it as a fallthrough, the proper fix is to just add the break for this case, instead of falling through to the default case and the break there. gcc also has that warning, but it looks like gcc only warns for the cases where they fall through to "real code", rather than to just a break. Odd. Fixes: d30d9ee94cc0 ("KVM: x86: Only advertise KVM_CAP_READONLY_MEM when supported by VM") Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Tom Dohrmann <erbse.13@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-06drm/amdgpu: drop redundant W=1 warnings from MakefileJani Nikula
Since commit a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the subsystem"), most of the extra warnings in the driver Makefile are redundant. Remove them. Note that -Wmissing-declarations and -Wmissing-prototypes are always enabled by default in scripts/Makefile.extrawarn. Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: revert "use CPU for page table update if SDMA is unavailable"Christian König
That is clearly not something we should do upstream. The SDMA is mandatory for the driver to work correctly. We could do this for emulation and bringup, but in those cases the engineer should probably enabled CPU based updates manually. This reverts commit 62eefd10ac1c7e976bda47ff311bd87cee40ab8d. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu/mes11: Indent an if statmentDan Carpenter
Indent the "break" statement one more tab. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdkfd: Document and define SVM events message macroPhilip Yang
Document how to use SMI system management interface to enable and receive SVM events. Document SVM event triggers. Define SVM events message string format macro that could be used by user mode for sscanf to parse the event. Add it to uAPI header file to make it obvious that is changing uAPI in future. No functional changes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdkfd: Select reset method for poison handlingHawking Zhang
Driver mode-2 is only supported by relative new smc firmware. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdkfd: fix missed queue reset on queue destroyJonathan Kim
If a queue is being destroyed but causes a HWS hang on removal, the KFD may issue an unnecessary gpu reset if the destroyed queue can be fixed by a queue reset. This is because the queue has been removed from the KFD's queue list prior to the preemption action on destroy so the reset call will fail to match the HQD PQ reset information against the KFD's queue record to do the actual reset. To fix this, deactivate the queue prior to preemption since it's being destroyed anyways and remove the queue from the KFD's queue list after preemption. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: Surface svm_default_granularity, a RW module parameterRamesh Errabolu
Enables users to update SVM's default granularity, used in buffer migration and handling of recoverable page faults. Param value is set in terms of log(numPages(buffer)), e.g. 9 for a 2 MIB buffer Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: fix queue reset issue by mmioJesse Zhang
Initialize the queue type before resetting the queue using mmio. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd/display: Add kdoc entry for 'program_isharp_1dlut' in ↵Srinivasan Shanmugam
'dpp401_dscl_program_isharp' Added a descriptor for the 'program_isharp_1dlut' parameter, which is a flag used to determine whether to program the isharp 1D LUT. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:963: warning: Function parameter or struct member 'program_isharp_1dlut' not described in 'dpp401_dscl_program_isharp' Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in ↵Srinivasan Shanmugam
cleaner shader This commit replaces the use of amdgpu_job_submit_direct which submits the job to the ring directly, with drm_sched_entity in the cleaner shader job submission process. The change allows the GPU scheduler to manage the cleaner shader job. - The job is then submitted to the GPU using the drm_sched_entity_push_job function, which allows the GPU scheduler to manage the job. This change improves the reliability of the cleaner shader job submission process by leveraging the capabilities of the GPU scheduler. Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu/: Add missing kdoc entry in amdgpu_vm_handle_fault functionSrinivasan Shanmugam
This commit adds a description for the 'ts' parameter in the amdgpu_vm_handle_fault function's comment block. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2781: warning: Function parameter or struct member 'ts' not described in 'amdgpu_vm_handle_fault' Cc: Xiaogang.Chen <Xiaogang.Chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202408251419.vgZHg3GV-lkp@intel.com/ Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd/display: fix dccg root clock optimization related hangQili Lu
[Why] enable dpp rcg before we disable dppclk in hw_init cause system hang/reboot [How] we remove dccg rcg related code from init into a separate function and call it after we init pipe Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Qili Lu <qili.lu@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd/display: Refactor dccg35_get_other_enabled_symclk_feNicholas Susanto
[Why] Function used to check the number of FEs connected to the current BE. This was then used to determine if the symclk could be disabled, if all FEs were disconnected. However, the function would skip over the primary FE and return 0 when the primary FE was still connected. This caused black screens on driver disable with an MST daisy chain hooked up. [How] Refactor the function to correctly return the number of FEs connected to the input BE. Also, rename it for clarity. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: Normalize reg offsets on JPEG v4.0.3Lijo Lazar
On VFs and SOCs with GC 9.4.4, VCN RRMT is disabled. Only local register offsets should be used on JPEG v4.0.3 as they cannot handle remote access to other AIDs. Since only local offsets are used, the special write to MCM_ADDR register is no longer needed. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()Tobias Jakobi
dc_state_destruct() nulls the resource context of the DC state. The pipe context passed to dcn35_set_drr() is a member of this resource context. If dc_state_destruct() is called parallel to the IRQ processing (which calls dcn35_set_drr() at some point), we can end up using already nulled function callback fields of struct stream_resource. The logic in dcn35_set_drr() already tries to avoid this, by checking tg against NULL. But if the nulling happens exactly after the NULL check and before the next access, then we get a race. Avoid this by copying tg first to a local variable, and then use this variable for all the operations. This should work, as long as nobody frees the resource pool where the timing generators live. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142 Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2") Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()Tobias Jakobi
dc_state_destruct() nulls the resource context of the DC state. The pipe context passed to dcn10_set_drr() is a member of this resource context. If dc_state_destruct() is called parallel to the IRQ processing (which calls dcn10_set_drr() at some point), we can end up using already nulled function callback fields of struct stream_resource. The logic in dcn10_set_drr() already tries to avoid this, by checking tg against NULL. But if the nulling happens exactly after the NULL check and before the next access, then we get a race. Avoid this by copying tg first to a local variable, and then use this variable for all the operations. This should work, as long as nobody frees the resource pool where the timing generators live. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142 Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2") Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Tested-by: Raoul van Rüschen <raoul.van.rueschen@gmail.com> Tested-by: Christopher Snowhill <chris@kode54.net> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Sefa Eyeoglu <contact@scrumplex.net> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: use clamp() in amdgpu_vm_adjust_size()Li Zetao
When it needs to get a value within a certain interval, using clamp() makes the code easier to understand than min(max()). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Li Zetao <lizetao1@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd: use clamp() in amdgpu_pll_get_fb_ref_div()Li Zetao
When it needs to get a value within a certain interval, using clamp() makes the code easier to understand than min(max()). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Li Zetao <lizetao1@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: enable gfxoff quirk on HP 705G4Peng Liu
Enabling gfxoff quirk results in perfectly usable graphical user interface on HP 705G4 DM with R5 2400G. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Peng Liu <liupeng01@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: add raven1 gfxoff quirkPeng Liu
Fix screen corruption with openkylin. Link: https://bbs.openkylin.top/t/topic/171497 Signed-off-by: Peng Liu <liupeng01@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amd/display: Fix spelling mistake "recompte" -> "recompute"Colin Ian King
There is a spelling mistake in a DRM_DEBUG_DRIVER message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdkfd: Add cache line size infoDavid Belanger
Populate cache line size info in topology based on information from IP discovery table. Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Sreekant Somasekharan <Sreekant.Somasekharan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>