path: root/drivers/iommu/arm
Age | Commit message | Author
3 daysMerge tag 'for-linus-iommufd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "This broadly brings the assigned HW command queue support to iommufd. This feature is used to improve SVA performance in VMs by avoiding paravirtualization traps during SVA invalidations. Along the way I think some of the core logic is in a much better state to support future driver backed features. Summary: - IOMMU HW now has features to directly assign HW command queues to a guest VM. In this mode the command queue operates on a limited set of invalidation commands that are suitable for improving guest invalidation performance and easy for the HW to virtualize. This brings the generic infrastructure to allow IOMMU drivers to expose such command queues through the iommufd uAPI, mmap the doorbell pages, and get the guest physical range for the command queue ring itself. - An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built on the new iommufd command queue features. It works with the existing SMMU driver support for cmdqv in guest VMs. - Many precursor cleanups and improvements to support the above cleanly, changes to the general ioctl and object helpers, driver support for VDEVICE, and mmap pgoff cookie infrastructure. - Sequence VDEVICE destruction to always happen before VFIO device destruction. When using the above type features, and also in future confidential compute, the internal virtual device representation becomes linked to HW or CC TSM configuration and objects. If a VFIO device is removed from iommufd those HW objects should also be cleaned up to prevent a sort of UAF. This became important now that we have HW backing the VDEVICE. 
- Fix one syzkaller found error related to math overflows during iova allocation" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (57 commits) iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3 iommufd: Rename some shortterm-related identifiers iommufd/selftest: Add coverage for vdevice tombstone iommufd/selftest: Explicitly skip tests for inapplicable variant iommufd/vdevice: Remove struct device reference from struct vdevice iommufd: Destroy vdevice on idevice destroy iommufd: Add a pre_destroy() op for objects iommufd: Add iommufd_object_tombstone_user() helper iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice iommufd/selftest: Test reserved regions near ULONG_MAX iommufd: Prevent ALIGN() overflow iommu/tegra241-cmdqv: import IOMMUFD module namespace iommufd: Do not allow _iommufd_object_alloc_ucmd if abort op is set iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support iommu/tegra241-cmdqv: Add user-space use support iommu/tegra241-cmdqv: Do not statically map LVCMDQs iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf() iommu/tegra241-cmdqv: Use request_threaded_irq iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops ...
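The overflow class mentioned above ("Prevent ALIGN() overflow") can be illustrated with a small userspace sketch. This is not the code from the iommufd patch: `align_up_checked()` is a hypothetical helper, and it assumes a GCC/Clang toolchain for `__builtin_add_overflow`. It shows why rounding up a value near the top of the address space needs an explicit overflow check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Round x up to a multiple of align (a power of two) and report overflow
 * instead of silently wrapping. Hypothetical helper for illustration only.
 */
static bool align_up_checked(uint64_t x, uint64_t align, uint64_t *out)
{
	uint64_t sum;

	/* The mask trick below is only valid for power-of-two alignments. */
	assert(align != 0 && (align & (align - 1)) == 0);

	/* x + (align - 1) is the step where a plain ALIGN()-style macro wraps. */
	if (__builtin_add_overflow(x, align - 1, &sum))
		return false;

	*out = sum & ~(align - 1);
	return true;
}
```

A caller would treat a `false` return as "no room left" rather than using a wrapped, too-small address.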
5 daysMerge tag 'iommu-updates-v6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Will Deacon: "Core: - Remove the 'pgsize_bitmap' member from 'struct iommu_ops' - Convert the x86 drivers over to msi_create_parent_irq_domain() AMD-Vi: - Add support for examining driver/device internals via debugfs - Add support for "HATDis" to disable host translation when it is not supported - Add support for limiting the maximum host translation level based on EFR[HATS] Apple DART: - Don't enable as built-in by default when ARCH_APPLE is selected Arm SMMU: - Devicetree bindings update for the Qualcomm SMMU in the "Milos" SoC - Support for Qualcomm SM6115 MDSS parts - Disable PRR on Qualcomm SM8250 as using these bits causes the hypervisor to explode Intel VT-d: - Reorganize Intel VT-d to be ready for iommupt - Optimize iotlb_sync_map for non-caching/non-RWBF modes - Fix missed PASID in dev TLB invalidation in cache_tag_flush_all() Mediatek: - Fix build warnings when W=1 Samsung Exynos: - Add support for reserved memory regions specified by the bootloader TI OMAP: - Use syscon_regmap_lookup_by_phandle_args() instead of parsing the node manually Misc: - Cleanups and minor fixes across the board" * tag 'iommu-updates-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (48 commits) iommu/vt-d: Fix UAF on sva unbind with pending IOPFs iommu/vt-d: Make iotlb_sync_map a static property of dmar_domain dt-bindings: arm-smmu: Remove sdm845-cheza specific entry iommu/amd: Fix geometry.aperture_end for V2 tables iommu/amd: Wrap debugfs ABI testing symbols snippets in literal code blocks iommu/amd: Add documentation for AMD IOMMU debugfs support iommu/amd: Add debugfs support to dump IRT Table iommu/amd: Add debugfs support to dump device table iommu/amd: Add support for device id user input iommu/amd: Add debugfs support to dump IOMMU command buffer iommu/amd: Add debugfs support to dump IOMMU Capability registers iommu/amd: Add debugfs support to dump IOMMU MMIO registers 
iommu/amd: Refactor AMD IOMMU debugfs initial setup dt-bindings: arm-smmu: document the support on Milos iommu/exynos: add support for reserved regions iommu/arm-smmu: disable PRR on SM8250 iommu/arm-smmu-v3: Revert vmaster in the error path iommu/io-pgtable-arm: Remove unused macro iopte_prot iommu/arm-smmu-qcom: Add SM6115 MDSS compatible iommu/qcom: Fix pgsize_bitmap ...
7 daysiommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_sizeNicolin Chen
It's more flexible to have a get_viommu_size op. Replace static vsmmu_size and vsmmu_type with that. Link: https://patch.msgid.link/r/20250724221002.1883034-3-nicolinc@nvidia.com Suggested-by: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
7 daysiommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3Nicolin Chen
When viommu type is IOMMU_VIOMMU_TYPE_ARM_SMMUV3, always return or init the standard struct arm_vsmmu, instead of going through impl_ops, which must have its own viommu type other than the standard IOMMU_VIOMMU_TYPE_ARM_SMMUV3. Given that arm_vsmmu_init() is called after arm_smmu_get_viommu_size(), any unsupported viommu->type must be a corruption. And it must be a driver bug if its vsmmu_size and vsmmu_init ops aren't paired. Warn in these two cases. Link: https://patch.msgid.link/r/20250724221002.1883034-2-nicolinc@nvidia.com Suggested-by: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
11 daysMerge branch 'arm/smmu/updates' into nextWill Deacon
* arm/smmu/updates: iommu/arm-smmu: disable PRR on SM8250 iommu/arm-smmu-v3: Revert vmaster in the error path iommu/io-pgtable-arm: Remove unused macro iopte_prot
11 daysMerge branch 'arm/smmu/bindings' into nextWill Deacon
* arm/smmu/bindings: dt-bindings: arm-smmu: Remove sdm845-cheza specific entry dt-bindings: arm-smmu: document the support on Milos iommu/arm-smmu-qcom: Add SM6115 MDSS compatible
2025-07-18iommufd/vdevice: Remove struct device reference from struct vdeviceXu Yilun
Remove struct device *dev from struct vdevice. The dev pointer is the Plan B for vdevice to reference the physical device. As now vdev->idev is added without refcounting concern, just use vdev->idev->dev when needed. To avoid exposing struct iommufd_device in the public header, export a iommufd_vdevice_to_device() helper. Link: https://patch.msgid.link/r/20250716070349.1807226-6-yilun.xu@linux.intel.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-14iommu/tegra241-cmdqv: import IOMMUFD module namespaceArnd Bergmann
The tegra variant of smmu-v3 now uses the iommufd mmap interface but is missing the corresponding import: ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_object_depend from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol iommufd_viommu_report_event from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_destroy_mmap from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_object_undepend from namespace IOMMUFD, but does not import it. ERROR: modpost: module arm_smmu_v3 uses symbol _iommufd_alloc_mmap from namespace IOMMUFD, but does not import it. Fixes: b135de24cfc0 ("iommu/tegra241-cmdqv: Add user-space use support") Link: https://patch.msgid.link/r/20250714205747.3475772-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-14iommu/arm-smmu: disable PRR on SM8250Dmitry Baryshkov
On SM8250 / QRB5165-RB5 using PRR bits resets the device, most likely because of the hyp limitations. Disable PRR support on that platform. Fixes: 7f2ef1bfc758 ("iommu/arm-smmu: Add support for PRR bit setup") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Reviewed-by: Rob Clark <robin.clark@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250705-iommu-fix-prr-v2-1-406fecc37cf8@oss.qualcomm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14iommu/arm-smmu-v3: Revert vmaster in the error pathNicolin Chen
The error path for err_free_master_domain leaks the vmaster. Move all the kfrees for vmaster into the goto error section. Fixes: cfea71aea921 ("iommu/arm-smmu-v3: Put iopf enablement in the domain attach path") Cc: stable@vger.kernel.org Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20250711204020.1677884-1-nicolinc@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14iommu/arm-smmu-qcom: Add SM6115 MDSS compatibleAlexey Klimov
Add the SM6115 MDSS compatible to clients compatible list, as it also needs that workaround. Without this workaround, for example, QRB4210 RB2 which is based on SM4250/SM6115 generates a lot of smmu unhandled context faults during boot: arm_smmu_context_fault: 116854 callbacks suppressed arm-smmu c600000.iommu: Unhandled context fault: fsr=0x402, iova=0x5c0ec600, fsynr=0x320021, cbfrsynra=0x420, cb=5 arm-smmu c600000.iommu: FSR = 00000402 [Format=2 TF], SID=0x420 arm-smmu c600000.iommu: FSYNR0 = 00320021 [S1CBNDX=50 PNU PLVL=1] arm-smmu c600000.iommu: Unhandled context fault: fsr=0x402, iova=0x5c0d7800, fsynr=0x320021, cbfrsynra=0x420, cb=5 arm-smmu c600000.iommu: FSR = 00000402 [Format=2 TF], SID=0x420 and also failed initialisation of lontium lt9611uxc, gpu and dpu is observed: (binding MDSS components triggered by lt9611uxc have failed) ------------[ cut here ]------------ !aspace WARNING: CPU: 6 PID: 324 at drivers/gpu/drm/msm/msm_gem_vma.c:130 msm_gem_vma_init+0x150/0x18c [msm] Modules linked in: ... (long list of modules) CPU: 6 UID: 0 PID: 324 Comm: (udev-worker) Not tainted 6.15.0-03037-gaacc73ceeb8b #4 PREEMPT Hardware name: Qualcomm Technologies, Inc. QRB4210 RB2 (DT) pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : msm_gem_vma_init+0x150/0x18c [msm] lr : msm_gem_vma_init+0x150/0x18c [msm] sp : ffff80008144b280 ... 
Call trace: msm_gem_vma_init+0x150/0x18c [msm] (P) get_vma_locked+0xc0/0x194 [msm] msm_gem_get_and_pin_iova_range+0x4c/0xdc [msm] msm_gem_kernel_new+0x48/0x160 [msm] msm_gpu_init+0x34c/0x53c [msm] adreno_gpu_init+0x1b0/0x2d8 [msm] a6xx_gpu_init+0x1e8/0x9e0 [msm] adreno_bind+0x2b8/0x348 [msm] component_bind_all+0x100/0x230 msm_drm_bind+0x13c/0x3d0 [msm] try_to_bring_up_aggregate_device+0x164/0x1d0 __component_add+0xa4/0x174 component_add+0x14/0x20 dsi_dev_attach+0x20/0x34 [msm] dsi_host_attach+0x58/0x98 [msm] devm_mipi_dsi_attach+0x34/0x90 lt9611uxc_attach_dsi.isra.0+0x94/0x124 [lontium_lt9611uxc] lt9611uxc_probe+0x540/0x5fc [lontium_lt9611uxc] i2c_device_probe+0x148/0x2a8 really_probe+0xbc/0x2c0 __driver_probe_device+0x78/0x120 driver_probe_device+0x3c/0x154 __driver_attach+0x90/0x1a0 bus_for_each_dev+0x68/0xb8 driver_attach+0x24/0x30 bus_add_driver+0xe4/0x208 driver_register+0x68/0x124 i2c_register_driver+0x48/0xcc lt9611uxc_driver_init+0x20/0x1000 [lontium_lt9611uxc] do_one_initcall+0x60/0x1d4 do_init_module+0x54/0x1fc load_module+0x1748/0x1c8c init_module_from_file+0x74/0xa0 __arm64_sys_finit_module+0x130/0x2f8 invoke_syscall+0x48/0x104 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x2c/0x80 el0t_64_sync_handler+0x10c/0x138 el0t_64_sync+0x198/0x19c ---[ end trace 0000000000000000 ]--- msm_dpu 5e01000.display-controller: [drm:msm_gpu_init [msm]] *ERROR* could not allocate memptrs: -22 msm_dpu 5e01000.display-controller: failed to load adreno gpu platform a400000.remoteproc:glink-edge:apr:service@7:dais: Adding to iommu group 19 msm_dpu 5e01000.display-controller: failed to bind 5900000.gpu (ops a3xx_ops [msm]): -22 msm_dpu 5e01000.display-controller: adev bind failed: -22 lt9611uxc 0-002b: failed to attach dsi to host lt9611uxc 0-002b: probe with driver lt9611uxc failed with error -22 Suggested-by: Bjorn Andersson <andersson@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Fixes: 3581b7062cec ("drm/msm/disp/dpu1: 
add support for display on SM6115") Cc: stable@vger.kernel.org Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lore.kernel.org/r/20250613173238.15061-1-alexey.klimov@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14iommu/qcom: Fix pgsize_bitmapJason Gunthorpe
qcom uses the ARM_32_LPAE_S1 format which uses the ARM long descriptor page table. Eventually arm_32_lpae_alloc_pgtable_s1() will adjust the pgsize_bitmap with: cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G); So the current declaration is nonsensical. Fix it to be just SZ_4K which is what it has actually been using so far. Most likely the qcom driver copy and pasted the pgsize_bitmap from something using the ARM_V7S format. Fixes: db64591de4b2 ("iommu/qcom: Remove iommu_ops pgsize_bitmap") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYvif6kDDFar5ZK4Dff3XThSrhaZaJundjQYujaJW978yg@mail.gmail.com/ Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/0-v1-65a7964d2545+195-qcom_pgsize_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
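The masking described in this commit can be checked with a few lines of arithmetic. The sketch below assumes the old declaration was a V7S-style set (`SZ_4K | SZ_64K | SZ_1M | SZ_16M`); the commit message only says this is "most likely" where it came from, so treat that input as an assumption. `lpae_s1_filter()` is a hypothetical name that mirrors the quoted `&=` line.

```c
/* Size constants written out so the sketch builds outside the kernel. */
#define SZ_4K  0x00001000UL
#define SZ_64K 0x00010000UL
#define SZ_1M  0x00100000UL
#define SZ_2M  0x00200000UL
#define SZ_16M 0x01000000UL
#define SZ_1G  0x40000000UL

/* Mirrors: cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G); */
static unsigned long lpae_s1_filter(unsigned long pgsize_bitmap)
{
	return pgsize_bitmap & (SZ_4K | SZ_2M | SZ_1G);
}
```

With the assumed V7S-style input, only the `SZ_4K` bit survives the mask, which matches the commit's statement that `SZ_4K` is what the driver has actually been using.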
2025-07-11iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV supportNicolin Chen
Add a new vEVENTQ type for VINTFs that are assigned to the user space. Simply report the two 64-bit LVCMDQ_ERR_MAPs register values. Link: https://patch.msgid.link/r/68161a980da41fa5022841209638aeff258557b5.1752126748.git.nicolinc@nvidia.com Reviewed-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11iommu/tegra241-cmdqv: Add user-space use supportNicolin Chen
The CMDQV HW supports a user-space use for virtualization cases. It allows the VM to issue guest-level TLBI or ATC_INV commands directly to the queue and executes them without a VMEXIT, as HW will replace the VMID field in a TLBI command and the SID field in an ATC_INV command with the preset VMID and SID. This is built upon the vIOMMU infrastructure by allowing VMM to allocate a VINTF (as a vIOMMU object) and assign VCMDQs (HW QUEUE objs) to the VINTF. So firstly, replace the standard vSMMU model with the VINTF implementation but reuse the standard cache_invalidate op (for unsupported commands) and the standard alloc_domain_nested op (for standard nested STE). Each VINTF has two 64KB MMIO pages (128B per logical VCMDQ): - Page0 (directly accessed by guest) has all the control and status bits. - Page1 (trapped by VMM) has guest-owned queue memory location/size info. VMM should trap the emulated VINTF0's page1 of the guest VM for the guest- level VCMDQ location/size info and forward that to the kernel to translate to a physical memory location to program the VCMDQ HW during an allocation call. Then, it should mmap the assigned VINTF's page0 to the VINTF0 page0 of the guest VM. This allows the guest OS to read and write the guest-own VINTF's page0 for direct control of the VCMDQ HW. For ATC invalidation commands that hold an SID, it requires all devices to register their virtual SIDs to the SID_MATCH registers and their physical SIDs to the pairing SID_REPLACE registers, so that HW can use those as a lookup table to replace those virtual SIDs with the correct physical SIDs. Thus, implement the driver-allocated vDEVICE op with a tegra241_vintf_sid structure to allocate SID_REPLACE and to program the SIDs accordingly. This enables the HW accelerated feature for NVIDIA Grace CPU. 
Compared to the standard SMMUv3 operating in the nested translation mode trapping CMDQ for TLBI and ATC_INV commands, this gives a huge performance improvement: 70% to 90% reductions of invalidation time were measured by various DMA unmap tests running in a guest OS. Link: https://patch.msgid.link/r/fb0eab83f529440b6aa181798912a6f0afa21eb0.1752126748.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
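The SID_MATCH/SID_REPLACE pairing described above behaves like a small lookup table. The following userspace model is only a sketch of that idea: the table size, the struct layout, and the function names are all hypothetical and are not taken from the driver or the hardware documentation.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SID_SLOTS 16 /* assumed table size, for illustration only */

struct sid_slot {
	bool valid;
	uint32_t match;   /* virtual SID as seen by the guest */
	uint32_t replace; /* physical SID substituted by the HW */
};

struct vintf_sid_table {
	struct sid_slot slots[NUM_SID_SLOTS];
};

/* Register a virtual/physical SID pair; returns the slot index or -1. */
static int vintf_register_sid(struct vintf_sid_table *t, uint32_t vsid,
			      uint32_t psid)
{
	for (int i = 0; i < NUM_SID_SLOTS; i++) {
		if (!t->slots[i].valid) {
			t->slots[i] = (struct sid_slot){ true, vsid, psid };
			return i;
		}
	}
	return -1;
}

/* Model of the lookup applied to the SID field of an ATC_INV command. */
static bool vintf_translate_sid(const struct vintf_sid_table *t, uint32_t vsid,
				uint32_t *psid)
{
	for (int i = 0; i < NUM_SID_SLOTS; i++) {
		if (t->slots[i].valid && t->slots[i].match == vsid) {
			*psid = t->slots[i].replace;
			return true;
		}
	}
	return false;
}
```

The model simply reports "no match" for an unregistered virtual SID; what the real hardware does in that case is not stated in the commit message.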
2025-07-11iommu/tegra241-cmdqv: Do not statically map LVCMDQsNicolin Chen
To simplify the mappings from global VCMDQs to VINTFs' LVCMDQs, the design chose to do static allocations and mappings in the global reset function. However, with the user-owned VINTF support, it exposes a security concern: if user space VM only wants one LVCMDQ for a VINTF, statically mapping two or more LVCMDQs creates a hidden VCMDQ that user space could DoS attack by writing random stuff to overwhelm the kernel with unhandleable IRQs. Thus, to support the user-owned VINTF feature, a LVCMDQ mapping has to be done dynamically. HW allows pre-assigning global VCMDQs in the CMDQ_ALLOC registers, without finalizing the mappings by keeping CMDQV_CMDQ_ALLOCATED=0. So, add a pair of map/unmap helper that simply sets/clears that bit. For kernel-owned VINTF0, move LVCMDQ mappings to tegra241_vintf_hw_init(), and the unmappings to tegra241_vintf_hw_deinit(). For user-owned VINTFs that will be added, the mappings/unmappings will be on demand upon an LVCMDQ allocation from the user space. However, the dynamic LVCMDQ mapping/unmapping can complicate the timing of calling tegra241_vcmdq_hw_init/deinit(), which write LVCMDQ address space, i.e. requiring LVCMDQ to be mapped. Highlight that with a note to the top of either of them. Link: https://patch.msgid.link/r/be115a8f75537632daf5995b3e583d8a76553fba.1752126748.git.nicolinc@nvidia.com Acked-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
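The map/unmap pair described above amounts to setting and clearing a single "allocated" bit while leaving the pre-assigned fields untouched. A minimal sketch, assuming the bit sits at position 0 and using a plain variable in place of the MMIO register (the names and the bit position are illustrative, not the driver's):

```c
#include <stdbool.h>
#include <stdint.h>

#define CMDQ_ALLOCATED_BIT (1u << 0) /* assumed bit position */

/* Finalize a pre-assigned mapping by setting the allocated bit. */
static void lvcmdq_map(uint32_t *cmdq_alloc_reg)
{
	*cmdq_alloc_reg |= CMDQ_ALLOCATED_BIT;
}

/* Undo the mapping but keep the pre-assigned fields for later reuse. */
static void lvcmdq_unmap(uint32_t *cmdq_alloc_reg)
{
	*cmdq_alloc_reg &= ~CMDQ_ALLOCATED_BIT;
}

static bool lvcmdq_is_mapped(uint32_t cmdq_alloc_reg)
{
	return cmdq_alloc_reg & CMDQ_ALLOCATED_BIT;
}
```

Keeping the bit clear until a queue is actually requested is what avoids the hidden, unowned VCMDQ the commit message warns about.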
2025-07-11iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()Nicolin Chen
The current flow of tegra241_cmdqv_remove_vintf() is:
 1. For each LVCMDQ, tegra241_vintf_remove_lvcmdq():
    a. Disable the LVCMDQ HW
    b. Release the LVCMDQ SW resource
 2. For current VINTF, tegra241_vintf_hw_deinit():
    c. Disable all LVCMDQ HWs
    d. Disable VINTF HW
Obviously, the step 1.a and the step 2.c are redundant. Since tegra241_vintf_hw_deinit() disables all of its LVCMDQ HWs, it could simplify the flow in tegra241_cmdqv_remove_vintf() by calling that first:
 1. For current VINTF, tegra241_vintf_hw_deinit():
    a. Disable all LVCMDQ HWs
    b. Disable VINTF HW
 2. Release all LVCMDQ SW resources
Drop tegra241_vintf_remove_lvcmdq(), and move tegra241_vintf_free_lvcmdq() as the new step 2.
Link: https://patch.msgid.link/r/86c97c8c4ee9ca192e7e7fa3007c10399d792ce6.1752126748.git.nicolinc@nvidia.com
Acked-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
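The simplified ordering in this commit (disable all hardware first, then release software resources) can be sketched with a toy userspace model. The structs and function names below are hypothetical stand-ins and do not reproduce the driver's data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LVCMDQS 2 /* assumed count, for illustration only */

struct lvcmdq_flow {
	bool hw_enabled;
};

struct vintf_flow {
	bool hw_enabled;
	struct lvcmdq_flow *lvcmdqs[MAX_LVCMDQS];
};

/* Step 1: disable every LVCMDQ HW, then the VINTF HW itself. */
static void vintf_hw_deinit(struct vintf_flow *v)
{
	for (size_t i = 0; i < MAX_LVCMDQS; i++)
		if (v->lvcmdqs[i])
			v->lvcmdqs[i]->hw_enabled = false;
	v->hw_enabled = false;
}

/* Step 2 follows step 1: release SW resources once no HW is running. */
static void vintf_remove(struct vintf_flow *v)
{
	vintf_hw_deinit(v);
	for (size_t i = 0; i < MAX_LVCMDQS; i++) {
		free(v->lvcmdqs[i]);
		v->lvcmdqs[i] = NULL;
	}
}
```

Because the hardware is quiesced in one place, there is no longer a per-queue "disable then free" helper doing the same disable twice.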
2025-07-11iommu/tegra241-cmdqv: Use request_threaded_irqNicolin Chen
A vEVENT can be reported only from a threaded IRQ context. Change to using request_threaded_irq to support that. Link: https://patch.msgid.link/r/f160193980e3b273afbd1d9cfc3e360084c05ba6.1752126748.git.nicolinc@nvidia.com Acked-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11iommu/arm-smmu-v3-iommufd: Add hw_info to impl_opsNicolin Chen
This will be used by Tegra241 CMDQV implementation to report a non-default HW info data. Link: https://patch.msgid.link/r/8a3bf5709358eb21aed2e8434534c30ecf83917c.1752126748.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11iommu/arm-smmu-v3-iommufd: Add vsmmu_size/type and vsmmu_init impl opsNicolin Chen
An impl driver might want to allocate its own type of vIOMMU object or the standard IOMMU_VIOMMU_TYPE_ARM_SMMUV3 by setting up its own SW/HW bits, as the tegra241-cmdqv driver will add IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV. Add vsmmu_size/type and vsmmu_init to struct arm_smmu_impl_ops. Prioritize them in arm_smmu_get_viommu_size() and arm_vsmmu_init(). Link: https://patch.msgid.link/r/375ac2b056764534bb7c10ecc4f34a0bae82b108.1752126748.git.nicolinc@nvidia.com Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11iommu: Allow an input type in hw_info opNicolin Chen
The hw_info uAPI will support a bidirectional data_type field that can be used as an input field for user space to request for a specific info data. To prepare for the uAPI update, change the iommu layer first: - Add a new IOMMU_HW_INFO_TYPE_DEFAULT as an input, for which driver can output its only (or firstly) supported type - Update the kdoc accordingly - Roll out the type validation in the existing drivers Link: https://patch.msgid.link/r/00f4a2d3d930721f61367014717b3ba2d1e82a81.1752126748.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
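The type validation rolled out to drivers can be pictured as follows. This is a hedged sketch: the enum values, the `VENDOR_A`/`VENDOR_B` names, the function name, and the choice of `-EOPNOTSUPP` as the error are all assumptions made for illustration, not quotations from the kernel.

```c
#include <errno.h>

/* Illustrative type values; the real uAPI enum is not reproduced here. */
enum hw_info_type {
	HW_INFO_TYPE_DEFAULT = 0,
	HW_INFO_TYPE_VENDOR_A = 1,
	HW_INFO_TYPE_VENDOR_B = 2,
};

/*
 * Model of a driver that supports exactly one info type. The type field is
 * bidirectional: DEFAULT on input means "report whatever you support".
 */
static int vendor_b_hw_info(enum hw_info_type *type)
{
	if (*type != HW_INFO_TYPE_DEFAULT && *type != HW_INFO_TYPE_VENDOR_B)
		return -EOPNOTSUPP; /* assumed error code */

	*type = HW_INFO_TYPE_VENDOR_B;
	return 0;
}
```

A driver with several info types would instead switch on the requested type and fall back to its first supported one for the default input.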
2025-07-10iommu: Pass in a driver-level user data structure to viommu_init opNicolin Chen
The new type of vIOMMU for tegra241-cmdqv allows user space VM to use one of its virtual command queue HW resources exclusively. This requires user space to mmap the corresponding MMIO page from kernel space for direct HW control. To forward the mmap info (offset and length), iommufd should add a driver specific data structure to the IOMMUFD_CMD_VIOMMU_ALLOC ioctl, for driver to output the info during the vIOMMU initialization back to user space. Similar to the existing ioctls and their IOMMU handlers, add a user_data to viommu_init op to bridge between iommufd and drivers. Link: https://patch.msgid.link/r/90bd5637dab7f5507c7a64d2c4826e70431e45a4.1752126748.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-10iommu: Use enum iommu_hw_info_type for type in hw_info opNicolin Chen
Replace u32 to make it clear. No functional changes. Also simplify the kdoc since the type itself is clear enough. Link: https://patch.msgid.link/r/651c50dee8ab900f691202ef0204cd5a43fdd6a2.1752126748.git.nicolinc@nvidia.com Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-06-30iommu/arm: Add BBM Level 2 smmu featureMikołaj Lenczewski
For supporting BBM Level 2 for userspace mappings, we want to ensure that the smmu also supports its own version of BBM Level 2. Luckily, the smmu spec (IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI 0487K.a D8.16.2), so already guarantees that no aborts are raised when BBM level 2 is claimed. Add the feature and testing for it under arm_smmu_sva_supported(). Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20250625113435.26849-4-miko.lenczewski@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-06-27iommu/qcom: Remove iommu_ops pgsize_bitmapJason Gunthorpe
This driver just uses a constant, put it in domain_alloc_paging and use the domain's value instead of ops during init_domain. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/6-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/arm-smmu: Remove iommu_ops pgsize_bitmapJason Gunthorpe
The driver never reads this value, arm_smmu_init_domain_context() always sets domain.pgsize_bitmap to smmu->pgsize_bitmap, the per-instance value. Remove the ops version entirely, the related dead code and make arm_smmu_ops const. Since this driver does not yet finalize the domain under arm_smmu_domain_alloc_paging() add a page size initialization to alloc so the page size is still setup prior to attach. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/2-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/arm-smmu-v3: Remove iommu_ops pgsize_bitmapJason Gunthorpe
The driver never reads this value, arm_smmu_domain_finalise() always sets domain.pgsize_bitmap to pgtbl_cfg, which comes from the per-smmu calculated value. Remove the ops version entirely, the related dead code and make arm_smmu_ops const. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/1-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-19iommu/arm-smmu-v3: Replace arm_vsmmu_alloc with arm_vsmmu_initNicolin Chen
To ease the for-driver iommufd APIs, get_viommu_size and viommu_init ops are introduced. Sanitize the inputs and report the size of struct arm_vsmmu on success, in arm_smmu_get_viommu_size(). Place the type sanity check last, because there will soon be an impl level get_viommu_size op, which will require the same sanity tests beforehand. It can simply insert a piece of code in front of the IOMMU_VIOMMU_TYPE_ARM_SMMUV3 sanity check. The core will ensure the viommu_type is set to the core vIOMMU object, and pass in the same dev pointer, so arm_vsmmu_init() won't need to repeat the same sanity tests but can simply init the arm_vsmmu struct. Remove the arm_vsmmu_alloc, completing the replacement. Link: https://patch.msgid.link/r/64e4b4c33acd26e1bd676e077be80e00fb63f17c.1749882255.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-05-31Revert "iommu: make inclusion of arm/arm-smmu-v3 directory conditional"Linus Torvalds
This reverts commit e436576b0231542f6f233279f0972989232575a8. That commit is very broken, and seems to have missed the fact that CONFIG_ARM_SMMU_V3 is not just a yes-or-no thing, but also can be modular. So it caused build errors on arm64 allmodconfig setups: ERROR: modpost: "arm_smmu_make_cdtable_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s2_domain_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s1_cd" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ... (and six more symbols just the same). Link: https://lore.kernel.org/all/CAHk-=wh4qRwm7AQ8sBmQj7qECzgAhj4r73RtCDfmHo5SdcN0Jw@mail.gmail.com/ Cc: Joerg Roedel <joro@8bytes.org> Cc: Rolf Eike Beer <eb@emlix.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-23Merge branches 'fixes', 'apple/dart', 'arm/smmu/updates', ↵Joerg Roedel
'arm/smmu/bindings', 'fsl/pamu', 'mediatek', 'renesas/ipmmu', 's390', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
2025-05-21iommu/arm-smmu-qcom: Make set_stall work when the device is onConnor Abbott
Up until now we have only called the set_stall callback during initialization when the device is off. But we will soon start calling it to temporarily disable stall-on-fault when the device is on, so handle that by checking if the device is on and writing SCTLR. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-3-fce6ee218787@gmail.com [will: Fix "mixed declarations and code" warning from sparse] Signed-off-by: Will Deacon <will@kernel.org>
2025-05-21iommu/arm-smmu: Move handing of RESUME to the context fault handlerConnor Abbott
The upper layer fault handler is now expected to handle everything required to retry the transaction or dump state related to it, since we enable threaded IRQs. This means that we can take charge of writing RESUME, making sure that we always write it after writing FSR as recommended by the specification. The iommu handler should write -EAGAIN if a transaction needs to be retried. This avoids tricky cross-tree changes in drm/msm, since it never wants to retry the transaction and it already returns 0 from its fault handler. Therefore it will continue to correctly terminate the transaction without any changes required. devcoredumps from drm/msm will temporarily be broken until it is fixed to collect devcoredumps inside its fault handler, but fixing that first would actually be worse because MMU-500 ignores writes to RESUME unless all fields of FSR (except SS of course) are clear and raises an interrupt when only SS is asserted. Right now, things happen to work most of the time if we collect a devcoredump, because RESUME is written asynchronously in the fault worker after the fault handler clears FSR and finishes, although there will be some spurious faults, but if this is changed before this commit fixes the FSR/RESUME write order then SS will never be cleared, the interrupt will never be cleared, and the whole system will hang every time a fault happens. It will therefore help bisectability if this commit goes first. I've changed the TBU path to also accept -EAGAIN and do the same thing, while keeping the old -EBUSY behavior. Although the old path was broken because you'd get a storm of interrupts due to returning IRQ_NONE that would eventually result in the interrupt being disabled, and I think it was dead code anyway, so it should eventually be deleted. Note that drm/msm never uses TBU so this is untested. 
Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-2-fce6ee218787@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
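The return-code convention described in this commit can be sketched in plain user-space C. This is only an illustrative model, not the driver code: the register names, the RESUME_RETRY/RESUME_TERMINATE values and the finish_fault() helper are assumptions made for the example. It shows the two points the message makes: FSR is written before RESUME, and only -EAGAIN from the handler selects a retry.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative values only; not taken from the driver headers. */
#define RESUME_RETRY     0u
#define RESUME_TERMINATE 1u

enum reg { REG_FSR, REG_RESUME };

/* Records register writes in order so the sequence can be inspected. */
struct write_log {
	enum reg reg[2];
	uint32_t val[2];
	int n;
};

static void reg_write(struct write_log *wl, enum reg r, uint32_t v)
{
	assert(wl->n < 2);
	wl->reg[wl->n] = r;
	wl->val[wl->n] = v;
	wl->n++;
}

/* Only -EAGAIN asks for a retry; 0 or any other value terminates. */
static uint32_t resume_value(int handler_ret)
{
	return handler_ret == -EAGAIN ? RESUME_RETRY : RESUME_TERMINATE;
}

/* Hypothetical helper: clear the fault bits first, then write RESUME. */
static void finish_fault(struct write_log *wl, uint32_t fsr, int handler_ret)
{
	reg_write(wl, REG_FSR, fsr);
	reg_write(wl, REG_RESUME, resume_value(handler_ret));
}
```

A handler that returns 0, as drm/msm is said to do, therefore ends up terminating the transaction without any change on its side.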
2025-05-21iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500Connor Abbott
The recommended flow for stall-on-fault in SMMUv2 is the following:
1. Resolve the fault.
2. Write to FSR to clear the fault bits.
3. Write RESUME to retry or fail the transaction.
MMU500 is designed with this sequence in mind. For example, experimentally we have seen on MMU500 that writing RESUME does not clear FSR.SS unless the original fault is cleared in FSR, so 2 must come before 3. FSR.SS is allowed to signal a fault (and does on MMU500) so that if we try to do 2 -> 1 -> 3 (while exiting from the fault handler after 2) we can get duplicate faults without hacks to disable interrupts.
However, resolving the fault typically requires lengthy operations that can stall, like bringing in pages from disk. The only current user, drm/msm, dumps GPU state before failing the transaction which indeed can stall.
Therefore, from now on we will require implementations that want to use stall-on-fault to also enable threaded IRQs. Do that with the Adreno MMU implementations.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-1-fce6ee218787@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
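The ordering constraint between steps 2 and 3 can be illustrated with a toy model of a context bank. This is a sketch under stated assumptions, not hardware documentation: the bit positions, struct ctx_bank and all function names are hypothetical, and the model encodes only the behaviour the message reports observing on MMU500 (RESUME has no effect on SS while fault bits are still set in FSR).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define FSR_FAULT (1u << 1)   /* stands in for the fault bits */
#define FSR_SS    (1u << 30)  /* stalled status */

struct ctx_bank {
	uint32_t fsr;
};

static void raise_stalled_fault(struct ctx_bank *cb)
{
	cb->fsr = FSR_FAULT | FSR_SS;
}

/* Fault bits are modelled as write-one-to-clear; SS is left untouched. */
static void write_fsr(struct ctx_bank *cb, uint32_t val)
{
	cb->fsr &= ~(val & FSR_FAULT);
}

/* Modelled behaviour: RESUME is ignored while fault bits are still set. */
static bool write_resume(struct ctx_bank *cb)
{
	if (cb->fsr & FSR_FAULT)
		return false;
	cb->fsr &= ~FSR_SS;
	return true;
}

/* Steps 1 -> 2 -> 3 from the list above (step 1 is elided here). */
static void handle_stalled_fault(struct ctx_bank *cb)
{
	bool resumed;

	write_fsr(cb, FSR_FAULT);      /* 2. clear the fault bits */
	resumed = write_resume(cb);    /* 3. retry or fail the transaction */
	assert(resumed);
	(void)resumed;
}
```

In this model, writing RESUME before clearing FSR leaves SS asserted, which is the situation the message warns would keep the interrupt pending.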
2025-05-16iommu: make inclusion of arm/arm-smmu-v3 directory conditionalRolf Eike Beer
Nothing in there is active if CONFIG_ARM_SMMU_V3 is not enabled, so the whole directory can depend on that switch as well. Fixes: e86d1aa8b60f ("iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2434059.NG923GbCHz@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-06iommu/arm-smmu-qcom: Add SAR2130P MDSS compatibleDmitry Baryshkov
Add the SAR2130P compatible to the clients compatible list; the device requires an identity domain. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250418-sar2130p-display-v5-9-442c905cb3a4@oss.qualcomm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-06iommu/arm-smmu-v3: Fix incorrect return in arm_smmu_attach_devQinxin Xia
After commit 48e7b8e284e5 ("iommu/arm-smmu-v3: Remove arm_smmu_domain_finalise() during attach"), an error code is not returned on the attach path when the SMMU does not match the domain. This causes problems with VFIO because vfio_iommu_type1_attach_group() relies on this check to determine domain compatibility. Re-instate the -EINVAL return value when the SMMU doesn't match on the device attach path. Fixes: 48e7b8e284e5 ("iommu/arm-smmu-v3: Remove arm_smmu_domain_finalise() during attach") Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com> Link: https://lore.kernel.org/r/20250422112951.2027969-1-xiaqinxin@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
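The restored check amounts to comparing the SMMU instance behind the domain with the one behind the device. A minimal user-space sketch, assuming simplified stand-in types (struct smmu_device, struct smmu_domain, struct smmu_master and attach_dev() are hypothetical names, not the driver's):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the driver structures. */
struct smmu_device { int id; };
struct smmu_domain { struct smmu_device *smmu; };
struct smmu_master { struct smmu_device *smmu; };

/* Refuse to attach when the domain was set up for a different SMMU. */
static int attach_dev(const struct smmu_domain *domain,
		      const struct smmu_master *master)
{
	assert(domain != NULL && master != NULL);

	if (domain->smmu != master->smmu)
		return -EINVAL;	/* lets the caller detect incompatibility */

	return 0;
}
```

A caller can then treat -EINVAL as "this domain is not compatible with this device", which is the signal the message says VFIO depends on.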
2025-04-28iommu: Remove iommu_dev_enable/disable_feature()Lu Baolu
No external drivers use these interfaces anymore. Furthermore, no existing iommu drivers implement anything in the callbacks. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20250418080130.1844424-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28iommu: Remove IOMMU_DEV_FEAT_SVAJason Gunthorpe
None of the drivers implement anything here anymore, remove the dead code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-28iommu/arm-smmu-v3: Put iopf enablement in the domain attach pathJason Gunthorpe
SMMUv3 co-mingles FEAT_IOPF and FEAT_SVA behaviors so that fault reporting doesn't work unless both are enabled. This is not correct and causes problems for iommufd, which does not enable FEAT_SVA for its fault capable domains. These APIs are both obsolete; update SMMUv3 to use the new method like AMD implements. A driver should enable iopf support when a domain with an iopf_handler is attached, and disable iopf support when the domain is removed. Move the fault support logic to sva domain allocation and to domain attach, refusing to create or attach fault capable domains if the HW doesn't support it. Move all the logic for controlling the iopf queue under arm_smmu_attach_prepare(). Keep track of the number of domains on the master (over all the SSIDs) that require iopf. When the first domain requiring iopf is attached create the iopf queue, when the last domain is detached destroy it. Turn FEAT_IOPF and FEAT_SVA into no-ops. Remove the sva_lock, this is all protected by the group mutex. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250418080130.1844424-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
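The first-attach/last-detach accounting described above is a plain reference count. The following is a minimal user-space sketch of that idea only; struct master, its fields and both functions are hypothetical, and locking is left out (the message says the real code relies on the group mutex).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state; the field names are illustrative. */
struct master {
	bool hw_iopf;               /* HW can report recoverable faults */
	unsigned int iopf_refcount; /* attached domains that need iopf */
	bool iopf_queue_active;     /* stands in for the iopf queue */
};

/* Returns false when a fault capable domain must be refused. */
static bool attach_domain(struct master *m, bool has_iopf_handler)
{
	if (!has_iopf_handler)
		return true;            /* nothing to account for */
	if (!m->hw_iopf)
		return false;           /* HW cannot support this domain */
	if (m->iopf_refcount++ == 0)
		m->iopf_queue_active = true;   /* first user: set up */
	return true;
}

static void detach_domain(struct master *m, bool has_iopf_handler)
{
	if (!has_iopf_handler)
		return;
	assert(m->iopf_refcount > 0);
	if (--m->iopf_refcount == 0)
		m->iopf_queue_active = false;  /* last user: tear down */
}
```

Counting per master rather than per domain matches the message's note that the count covers all SSIDs on the device.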
2025-04-17iommu: Split out and tidy up Arm KconfigRobin Murphy
There are quite a lot of options for the Arm drivers, still all buried in the top-level Kconfig. For ease of use and consistency with all the other subdirectories, break these out into drivers/arm. For similar clarity and self-consistency, also tweak the ARM_SMMU sub-options to use "if" instead of "depends", to match ARM_SMMU_V3. Lastly also clean up the slightly messy description of ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT as highlighted by Geert - by now we really shouldn't need commentary on v4.x kernel behaviour anyway - and downgrade it to EXPERT as the first step in the 6-year-old threat to remove it entirely. Cc: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Link: https://lore.kernel.org/r/a614ec86ba78c09cd16e348f633f6bb38793391f.1742480488.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/arm-smmu-v3: Fail aliasing StreamIDs more gracefullyRobin Murphy
We've never supported StreamID aliasing between devices, and as such they will never have had functioning DMA, but this is not fatal to the SMMU itself. Although aliasing between hard-wired platform device StreamIDs would tend to raise questions about the whole system, in practice it's far more likely to occur relatively innocently due to legacy PCI bridges, where the underlying StreamID mappings are still perfectly reasonable. As such, return a more benign -ENODEV when failing probe for such an unsupported device (and log a more obvious error message), so that it doesn't break the entire SMMU probe now that bus_iommu_probe() runs in the right order and can propagate that error back. The end result is still that the device doesn't get an IOMMU group and probably won't work, same as before. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/39d54e49c8476efc4653e352150d44b185d6d50f.1744380554.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream idsNicolin Chen
ASPEED VGA card has two built-in devices:
 0008:06:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 06)
 0008:07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)
Its topology looks like this:
 +-[0008:00]---00.0-[01-09]--+-00.0-[02-09]--+-00.0-[03]----00.0 Sandisk Corp Device 5017
                             |               +-01.0-[04]--
                             |               +-02.0-[05]----00.0 NVIDIA Corporation Device
                             |               +-03.0-[06-07]----00.0-[07]----00.0 ASPEED Technology, Inc. ASPEED Graphics Family
                             |               +-04.0-[08]----00.0 Renesas Technology Corp. uPD720201 USB 3.0 Host Controller
                             |               \-05.0-[09]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
                             \-00.1 PMC-Sierra Inc. Device 4028
The IORT logic populates two identical IDs into the fwspec->ids array via DMA aliasing in iort_pci_iommu_init() called by pci_for_each_dma_alias(). Though the SMMU driver had been able to handle this situation since commit 563b5cbe334e ("iommu/arm-smmu-v3: Cope with duplicated Stream IDs"), that got broken by the later commit cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure"), which ended up allocating separate streams with the same stuffing.
On a kernel prior to v6.15-rc1, there has been an overlooked warning:
 pci 0008:07:00.0: vgaarb: setting as boot VGA device
 pci 0008:07:00.0: vgaarb: bridge control possible
 pci 0008:07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 pcieport 0008:06:00.0: Adding to iommu group 14
 ast 0008:07:00.0: stream 67328 already in tree <===== WARNING
 ast 0008:07:00.0: enabling device (0002 -> 0003)
 ast 0008:07:00.0: Using default configuration
 ast 0008:07:00.0: AST 2600 detected
 ast 0008:07:00.0: [drm] Using analog VGA
 ast 0008:07:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16
 [drm] Initialized ast 0.1.0 for 0008:07:00.0 on minor 0
 ast 0008:07:00.0: [drm] fb0: astdrmfb frame buffer device
With v6.15-rc, since the commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path"), the error returned with the warning is moved to the SMMU device probe flow:
 arm_smmu_probe_device+0x15c/0x4c0
 __iommu_probe_device+0x150/0x4f8
 probe_iommu_group+0x44/0x80
 bus_for_each_dev+0x7c/0x100
 bus_iommu_probe+0x48/0x1a8
 iommu_device_register+0xb8/0x178
 arm_smmu_device_probe+0x1350/0x1db0
which then fails the entire SMMU driver probe:
 pci 0008:06:00.0: Adding to iommu group 21
 pci 0008:07:00.0: stream 67328 already in tree
 arm-smmu-v3 arm-smmu-v3.9.auto: Failed to register iommu
 arm-smmu-v3 arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22
Since the SMMU driver had already been expecting a potential duplicated Stream ID in arm_smmu_install_ste_for_dev(), change the arm_smmu_insert_master() routine to ignore a duplicated ID from the fwspec->ids array as well.
Note: this has been failing the iommu_device_probe() since 2021, although a recent iommu commit in v6.15-rc1 that moves iommu_device_probe() started to fail the SMMU driver probe. Since nobody has cared about DMA Alias support, leave that as it was but fix the fundamental iommu_device_probe() breakage.
Fixes: cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure") Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20250415185620.504299-1-nicolinc@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
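The idea of ignoring a repeated ID from the same device can be sketched with a small lookup table in user-space C. This is a simplified model, not the driver's implementation: the flat array, struct names and insert_master() are hypothetical, the error codes are illustrative, and unlike real code the sketch does not roll back partial insertions on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STREAMS 16

struct master { const char *name; };

struct stream {
	uint32_t sid;
	const struct master *owner;
};

/* A flat array stands in for whatever lookup structure is really used. */
struct stream_table {
	struct stream s[MAX_STREAMS];
	size_t n;
};

static const struct master *find_owner(const struct stream_table *t,
					uint32_t sid)
{
	for (size_t i = 0; i < t->n; i++)
		if (t->s[i].sid == sid)
			return t->s[i].owner;
	return NULL;
}

static int insert_master(struct stream_table *t, const struct master *m,
			 const uint32_t *ids, size_t num_ids)
{
	assert(t != NULL && m != NULL);

	for (size_t i = 0; i < num_ids; i++) {
		const struct master *owner = find_owner(t, ids[i]);

		if (owner == m)
			continue;       /* same device, repeated ID: ignore */
		if (owner != NULL)
			return -EINVAL; /* ID already owned by another device */
		if (t->n == MAX_STREAMS)
			return -ENOMEM;

		t->s[t->n].sid = ids[i];
		t->s[t->n].owner = m;
		t->n++;
	}
	return 0;
}
```

With the two identical IDs from the log (67328 twice), the device ends up with a single entry instead of failing on the second one.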
2025-04-17iommu/arm-smmu-v3: Fix pgsize_bit for sva domainsBalbir Singh
UBSan caught a bug with IOMMU SVA domains, where the reported exponent value in __arm_smmu_tlb_inv_range() was >= 64. __arm_smmu_tlb_inv_range() uses the domain's pgsize_bitmap to compute the number of pages to invalidate and the invalidation range. Currently arm_smmu_sva_domain_alloc() does not set up the iommu domain's pgsize_bitmap. This leads to __ffs() on the value returning 64, and that leads to undefined behaviour w.r.t. shift operations. Fix this by initializing the iommu_domain's pgsize_bitmap to PAGE_SIZE. Effectively the code needs to use the smallest page size for invalidation. Cc: stable@vger.kernel.org Fixes: eb6c97647be2 ("iommu/arm-smmu-v3: Avoid constructing invalid range commands") Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Balbir Singh <balbirs@nvidia.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250412002354.3071449-1-balbirs@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
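Why an empty pgsize_bitmap is a problem can be shown in user-space C. This is a hedged sketch: lowest_set_bit() stands in for the kernel's __ffs() using the GCC/Clang builtin __builtin_ctzl(), SKETCH_PAGE_SIZE assumes 4 KiB pages, and inv_page_count() is a hypothetical helper rather than the driver's function.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096ul  /* assumed 4 KiB pages */

/*
 * Index of the lowest set bit. Like the kernel's __ffs(), the result
 * is not meaningful for 0, so the sketch rejects that input.
 * __builtin_ctzl() is a GCC/Clang builtin.
 */
static unsigned int lowest_set_bit(unsigned long word)
{
	assert(word != 0);
	return (unsigned int)__builtin_ctzl(word);
}

/* Number of smallest-granule pages covered by an invalidation range. */
static unsigned long inv_page_count(unsigned long pgsize_bitmap,
				    unsigned long size)
{
	unsigned int tg = lowest_set_bit(pgsize_bitmap);

	/* With an unset bitmap, tg would be out of range for this shift. */
	return size >> tg;
}
```

Initializing the bitmap to the page size gives a shift of 12 under the 4 KiB assumption, so an 8-page range yields a count of 8.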
2025-04-17iommu/arm-smmu-v3: Add missing S2FWB feature detectionAneesh Kumar K.V (Arm)
Commit 67e4fe398513 ("iommu/arm-smmu-v3: Use S2FWB for NESTED domains") introduced S2FWB usage but omitted the corresponding feature detection. As a result, vIOMMU allocation fails on FVP in arm_vsmmu_alloc(), due to the following check: if (!arm_smmu_master_canwbs(master) && !(smmu->features & ARM_SMMU_FEAT_S2FWB)) return ERR_PTR(-EOPNOTSUPP); This patch adds the missing detection logic to prevent allocation failure when S2FWB is supported. Fixes: 67e4fe398513 ("iommu/arm-smmu-v3: Use S2FWB for NESTED domains") Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Link: https://lore.kernel.org/r/20250408033351.1012411-1-aneesh.kumar@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-04-11iommu/tegra241-cmdqv: Fix warnings due to dmam_free_coherent()Nicolin Chen
Two WARNINGs are observed when SMMU driver rolls back upon failure:
 arm-smmu-v3.9.auto: Failed to register iommu
 arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22
 ------------[ cut here ]------------
 WARNING: CPU: 5 PID: 1 at kernel/dma/mapping.c:74 dmam_free_coherent+0xc0/0xd8
 Call trace:
  dmam_free_coherent+0xc0/0xd8 (P)
  tegra241_vintf_free_lvcmdq+0x74/0x188
  tegra241_cmdqv_remove_vintf+0x60/0x148
  tegra241_cmdqv_remove+0x48/0xc8
  arm_smmu_impl_remove+0x28/0x60
  devm_action_release+0x1c/0x40
 ------------[ cut here ]------------
 128 pages are still in use!
 WARNING: CPU: 16 PID: 1 at mm/page_alloc.c:6902 free_contig_range+0x18c/0x1c8
 Call trace:
  free_contig_range+0x18c/0x1c8 (P)
  cma_release+0x154/0x2f0
  dma_free_contiguous+0x38/0xa0
  dma_direct_free+0x10c/0x248
  dma_free_attrs+0x100/0x290
  dmam_free_coherent+0x78/0xd8
  tegra241_vintf_free_lvcmdq+0x74/0x160
  tegra241_cmdqv_remove+0x98/0x198
  arm_smmu_impl_remove+0x28/0x60
  devm_action_release+0x1c/0x40
This is because the LVCMDQ queue memory is managed by devres, while dmam_free_coherent() is called in the context of devm_action_release(). Jason pointed out that "arm_smmu_impl_probe() has mis-ordered the devres callbacks if ops->device_remove() is going to be manually freeing things that probe allocated": https://lore.kernel.org/linux-iommu/20250407174408.GB1722458@nvidia.com/
In fact, tegra241_cmdqv_init_structures() only allocates memory resources, which means any failure that it generates would be similar to -ENOMEM, so there is no point in having that "falling back to standard SMMU" routine, as the standard SMMU would likely fail to allocate memory too.
Remove the unwind part in tegra241_cmdqv_init_structures(), and return a proper error code to ask SMMU driver to call tegra241_cmdqv_remove() via impl_ops->device_remove(). Then, drop tegra241_vintf_free_lvcmdq() since devres will take care of that.
Fixes: 483e0bd8883a ("iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent") Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250407201908.172225-1-nicolinc@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-01Merge tag 'for-linus-iommufd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"Two significant new items:
 - Allow reporting IOMMU HW events to userspace when the events are clearly linked to a device. This is linked to the VIOMMU object and is intended to be used by a VMM to forward HW events to the virtual machine as part of emulating a vIOMMU. ARM SMMUv3 is the first driver to use this mechanism. Like the existing fault events the data is delivered through a simple FD returning event records on read().
 - PASID support in VFIO. The "Process Address Space ID" is a PCI feature that allows the device to tag all PCI DMA operations with an ID. The IOMMU will then use the ID to select a unique translation for those DMAs. This is part of Intel's vIOMMU support as VT-D HW requires the hypervisor to manage each PASID entry. The support is generic so any VFIO user could attach any translation to a PASID, and the support should work on ARM SMMUv3 as well. AMD requires additional driver work.
Some minor updates, along with fixes:
 - Prevent using nested parents with faults, no driver support today
 - Put a single "cookie_type" value in the iommu_domain to indicate what owns the various opaque owner fields"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (49 commits)
 iommufd: Test attach before detaching pasid
 iommufd: Fix iommu_vevent_header tables markup
 iommu: Convert unreachable() to BUG()
 iommufd: Balance veventq->num_events inc/dec
 iommufd: Initialize the flags of vevent in iommufd_viommu_report_event()
 iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO
 iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
 vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
 vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
 ida: Add ida_find_first_range()
 iommufd/selftest: Add coverage for iommufd pasid attach/detach
 iommufd/selftest: Add test ops to test pasid attach/detach
 iommufd/selftest: Add a helper to get test device
 iommufd/selftest: Add set_dev_pasid in mock iommu
 iommufd: Allow allocating PASID-compatible domain
 iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support
 iommufd: Enforce PASID-compatible domain for RID
 iommufd: Support pasid attach/replace
 iommufd: Enforce PASID-compatible domain in PASID path
 iommufd/device: Add pasid_attach array to track per-PASID attach
 ...
2025-03-20Merge branches 'apple/dart', 'arm/smmu/updates', 'arm/smmu/bindings', ↵Joerg Roedel
'rockchip', 's390', 'core', 'intel/vt-d' and 'amd/amd-vi' into next
2025-03-18iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigationsNicolin Chen
There is a DoS concern with the hardware event queue shared among devices passed through to VMs: too many translation failures belonging to VMs could overflow the shared queue if those VMs or their VMMs don't handle/recover the devices properly. The MEV bit in the STE configures the SMMU HW to merge similar event records, though merging is not guaranteed. Set it in a nested STE for DoS mitigations. In the future, we might want to enable the MEV for non-nested cases too such as domain->type == IOMMU_DOMAIN_UNMANAGED or even IOMMU_DOMAIN_DMA. Link: https://patch.msgid.link/r/8ed12feef67fc65273d0f5925f401a81f56acebe.1741719725.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMUNicolin Chen
Aside from the IOPF framework, iommufd provides an additional pathway to report hardware events, via the vEVENTQ of vIOMMU infrastructure. Define an iommu_vevent_arm_smmuv3 uAPI structure, and report stage-1 events in the threaded IRQ handler. Also, add another four event record types that can be forwarded to a VM. Link: https://patch.msgid.link/r/5cf6719682fdfdabffdb08374cdf31ad2466d75a.1741719725.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18iommu/arm-smmu-v3: Introduce struct arm_smmu_vmasterNicolin Chen
Use it to store all vSMMU-related data. The vsid (Virtual Stream ID) will be the first use case. Since the vsid reader will be the eventq handler that already holds a streams_mutex, reuse that to fence the vmaster too. Also add a pair of arm_smmu_attach_prepare/commit_vmaster helpers to set or unset the master->vmaster pointer. Put the helpers inside the existing arm_smmu_attach_prepare/commit(). For identity/blocked ops that don't call arm_smmu_attach_prepare/commit(), add a simpler arm_smmu_master_clear_vmaster helper to unset the vmaster. Link: https://patch.msgid.link/r/a7f282e1a531279e25f06c651e95d56f6b120886.1741719725.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-11iommu/arm-smmu: Set rpm auto_suspend once during probePranjal Shrivastava
The current code calls arm_smmu_rpm_use_autosuspend() during device attach, which seems unusual as it sets the autosuspend delay and the 'use_autosuspend' flag for the smmu device. These parameters can be simply set once during the smmu probe and in order to avoid bouncing rpm states, we can simply mark_last_busy() during a client dev attach as discussed in [1]. Move the handling of arm_smmu_rpm_use_autosuspend() to the SMMU probe and modify the arm_smmu_rpm_put() function to mark_last_busy() before calling __pm_runtime_put_autosuspend(). Additionally, s/pm_runtime_put_autosuspend/__pm_runtime_put_autosuspend/ to help with the refactor of the pm_runtime_put_autosuspend() API [2]. Link: https://lore.kernel.org/r/20241023164835.GF29251@willie-the-truck [1] Link: https://git.kernel.org/linus/b7d46644e554 [2] Signed-off-by: Pranjal Shrivastava <praan@google.com> Link: https://lore.kernel.org/r/20250123195636.4182099-1-praan@google.com Signed-off-by: Will Deacon <will@kernel.org>