summaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2023-03-28iommu/exynos: Fix set_platform_dma_ops() callbackMarek Szyprowski
There are some subtle differences between release_device() and set_platform_dma_ops() callbacks, so separate those two callbacks. Device links should be removed only in release_device(), because they were created in probe_device() on purpose and they are needed for proper Exynos IOMMU driver operation. While fixing this, remove the conditional code as it is not really needed. Reported-by: Jason Gunthorpe <jgg@ziepe.ca> Fixes: 189d496b48b1 ("iommu/exynos: Add missing set_platform_dma_ops callback") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org> Link: https://lore.kernel.org/r/20230315232514.1046589-1-m.szyprowski@samsung.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-28iommu/amd: Add 5 level guest page table supportVasant Hegde
Newer AMD IOMMU supports 5 level guest page table (v2 page table). If both processor and IOMMU supports 5 level page table then enable it. Otherwise fall back to 4 level page table. Co-developed-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Wei Huang <wei.huang2@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230310090000.1117786-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-28iommu/mediatek: Set dma_mask for PGTABLE_PA_35_ENYong Wu
When we enable PGTABLE_PA_35_EN, the PA for pgtable may be 35bits. Thus add dma_mask for it. Fixes: 301c3ca12576 ("iommu/mediatek: Allow page table PA up to 35bit") Signed-off-by: Chengci.Xu <chengci.xu@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20230316101445.12443-1-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-27iommu/arm-smmu-qcom: Limit the SMR groups to 128Manivannan Sadhasivam
Some platforms support more than 128 stream matching groups than what is defined by the ARM SMMU architecture specification. But due to some unknown reasons, those additional groups don't exhibit the same behavior as the architecture supported ones. For instance, the additional groups will not detect the quirky behavior of some firmware versions intercepting writes to S2CR register, thus skipping the quirk implemented in the driver and causing boot crash. So let's limit the groups to 128 for now until the issue with those groups are fixed and issue a notice to users in that case. Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20230327080029.11584-1-manivannan.sadhasivam@linaro.org [will: Reworded the comment slightly] Signed-off-by: Will Deacon <will@kernel.org>
2023-03-27iommu/arm-smmu-v3: Explain why ATS stays disabled with bypassJean-Philippe Brucker
The SMMU does not support enabling ATS for a bypass stream. Add a comment. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20230321100559.341981-1-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2023-03-23iommu: make the pointer to struct bus_type constantGreg Kroah-Hartman
A number of iommu functions take a struct bus_type * and never modify the data passed in, so make them all const * as that is what the driver core is expecting to have passed into as well. This is a step toward making all struct bus_type pointers constant in the kernel. Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: iommu@lists.linux.dev Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20230313182918.1312597-34-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22iommu: Use sysfs_emit() for sysfs showLu Baolu
Use sysfs_emit() instead of the sprintf() for sysfs entries. sysfs_emit() knows the maximum of the temporary buffer used for outputting sysfs content and avoids overrunning the buffer length. Prefer 'long long' over 'long long int' as suggested by checkpatch.pl. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322123421.278852-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Cleanup iommu_change_dev_def_domain()Lu Baolu
As the singleton group limitation has been removed, cleanup the code in iommu_change_dev_def_domain() accordingly. Documentation is also updated. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322064956.263419-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Replace device_lock() with group->mutexLu Baolu
device_lock() was used in iommu_group_store_type() to prevent the devices in an iommu group from being attached by any device driver. On the other hand, in order to avoid lock race between group->mutex and device_lock(), it limited the usage scenario to the singleton groups. We already have the DMA ownership scheme to avoid driver attachment and group->mutex ensures that device ops are always valid, there's no need for device_lock() anymore. Remove device_lock() and the singleton group limitation. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322064956.263419-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Move lock from iommu_change_dev_def_domain() to its callerLu Baolu
The intention is to make it possible to put group ownership check and default domain change in a same critical region protected by the group's mutex lock. No intentional functional change. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322064956.263419-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Same critical region for device release and removalLu Baolu
In a non-driver context, it is crucial to ensure the consistency of a device's iommu ops. Otherwise, it may result in a situation where a device is released but it's iommu ops are still used. Put the ops->release_device and __iommu_group_remove_device() in a same group->mutext critical region, so that, as long as group->mutex is held and the device is in its group's device list, its iommu ops are always consistent. Add check of group ownership if the released device is the last one. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322064956.263419-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Split iommu_group_remove_device() into helpersLu Baolu
So that code could be re-used by iommu_release_device() in the subsequent change. No intention for functionality change. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322064956.263419-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/ipmmu-vmsa: Call arm_iommu_release_mapping() in release pathLu Baolu
In the iommu driver's release_device operation, the driver should detach the device from any attached domain and release the resources allocated in the probe_device and probe_finalize paths. Replace arm_iommu_detach_device() with arm_iommu_release_mapping() in the release path of the ipmmu-vmsa driver. The device_release callback is called in device_del(), this device is not coming back. Zeroing out pointers and testing for a condition which cannot be true by construction is simply a waste of time and code. The bonus is that it also removes a obstacle of arm_iommu_detach_device() re-entering the iommu core during release_device. With this removed, the iommu core code could be simplified a lot. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/linux-iommu/7b248ba1-3967-5cd8-82e9-0268c706d320@arm.com/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230322064956.263419-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/amd: Allocate IOMMU irqs using numa locality infoVasant Hegde
Use numa information to allocate irq resources and also to set irq affinity. This optimizes the IOMMU interrupt handling. Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230321092348.6127-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/amd: Allocate page table using numa locality infoVasant Hegde
Introduce 'struct protection_domain->nid' variable. It will contain IOMMU NUMA node ID. And allocate page table pages using IOMMU numa locality info. This optimizes page table walk by IOMMU. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230321092348.6127-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/omap: Use of_property_read_bool() for boolean propertiesRob Herring
It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. Convert reading boolean properties to of_property_read_bool(). Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230310144709.1542980-1-robh@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Use of_property_present() for testing DT property presenceRob Herring
It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. As part of this, convert of_get_property/of_find_property calls to the recently added of_property_present() helper when we just want to test for presence of a property and nothing more. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230310144709.1542910-1-robh@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Spelling s/cpmxchg64/cmpxchg64/Geert Uytterhoeven
Fix misspellings of "cmpxchg64" Fixes: d286a58bc8f4d5cf ("iommu: Tidy up io-pgtable dependencies") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/eab156858147249d44463662eb9192202c39ab9f.1678295792.git.geert+renesas@glider.be Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/fsl: fix all kernel-doc warnings in fsl_pamu.cRandy Dunlap
Fix kernel-doc warnings as reported by the kernel test robot: fsl_pamu.c:192: warning: expecting prototype for pamu_config_paace(). Prototype was for pamu_config_ppaace() instead fsl_pamu.c:239: warning: Function parameter or member 'omi_index' not described in 'get_ome_index' fsl_pamu.c:239: warning: Function parameter or member 'dev' not described in 'get_ome_index' fsl_pamu.c:332: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Setup operation mapping and stash destinations for QMAN and QMAN portal. fsl_pamu.c:361: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Setup the operation mapping table for various devices. This is a static Fixes: 695093e38c3e ("iommu/fsl: Freescale PAMU driver and iommu implementation.") Fixes: cd70d4659ff3 ("iommu/fsl: Various cleanups") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Link: lore.kernel.org/r/202302281151.B1WtZvSC-lkp@intel.com Cc: Aditya Srivastava <yashsri421@gmail.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: iommu@lists.linux.dev Cc: Timur Tabi <timur@tabi.org> Cc: Varun Sethi <Varun.Sethi@freescale.com> Cc: Emil Medve <Emilian.Medve@Freescale.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Timur Tabi <timur@kernel.org> Link: https://lore.kernel.org/r/20230308034504.9985-1-rdunlap@infradead.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/ipmmu-vmsa: remove R-Car H3 ES1.* handlingWolfram Sang
R-Car H3 ES1.* was only available to an internal development group and needed a lot of quirks and workarounds. These become a maintenance burden now, so our development group decided to remove upstream support and disable booting for this SoC. Public users only have ES2 onwards. Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20230307163041.3815-2-wsa+renesas@sang-engineering.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu/sun50i: remove MODULE_LICENSE in non-modulesNick Alcock
Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Samuel Holland <samuel@sholland.org> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: iommu@lists.linux.dev Cc: linux-arm-kernel@lists.infradead.org Cc: linux-sunxi@lists.linux.dev Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/20230224150811.80316-8-nick.alcock@oracle.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-22iommu: Make kobj_type structure constantThomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definition to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20230214-kobj_type-iommu-v1-1-e7392834b9d0@weissschuh.net Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-16x86/mm/iommu/sva: Make LAM and SVA mutually exclusiveKirill A. Shutemov
IOMMU and SVA-capable devices know nothing about LAM and only expect canonical addresses. An attempt to pass down tagged pointer will lead to address translation failure. By default do not allow to enable both LAM and use SVA in the same process. The new ARCH_FORCE_TAGGED_SVA arch_prctl() overrides the limitation. By using the arch_prctl() userspace takes responsibility to never pass tagged address to the device. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20230312112612.31869-12-kirill.shutemov%40linux.intel.com
2023-03-16iommu/sva: Replace pasid_valid() helper with mm_valid_pasid()Kirill A. Shutemov
Kernel has few users of pasid_valid() and all but one checks if the process has PASID allocated. The helper takes ioasid_t as the input. Replace the helper with mm_valid_pasid() that takes mm_struct as the argument. The only call that checks PASID that is not tied to mm_struct is open-codded now. This is preparatory patch. It helps avoid ifdeffery: no need to dereference mm->pasid in generic code to check if the process has PASID. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20230312112612.31869-11-kirill.shutemov%40linux.intel.com
2023-03-10iommufd/selftest: Catch overflow of uptr and lengthJason Gunthorpe
syzkaller hits a WARN_ON when trying to have a uptr close to UINTPTR_MAX: WARNING: CPU: 1 PID: 393 at drivers/iommu/iommufd/selftest.c:403 iommufd_test+0xb19/0x16f0 Modules linked in: CPU: 1 PID: 393 Comm: repro Not tainted 6.2.0-c9c3395d5e3d #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:iommufd_test+0xb19/0x16f0 Code: 94 c4 31 ff 44 89 e6 e8 a5 54 17 ff 45 84 e4 0f 85 bb 0b 00 00 41 be fb ff ff ff e8 31 53 17 ff e9 a0 f7 ff ff e8 27 53 17 ff <0f> 0b 41 be 8 RSP: 0018:ffffc90000eabdc0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8214c487 RDX: 0000000000000000 RSI: ffff88800f5c8000 RDI: 0000000000000002 RBP: ffffc90000eabe48 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000000 R12: 00000000cd2b0000 R13: 00000000cd2af000 R14: 0000000000000000 R15: ffffc90000eabe68 FS: 00007f94d76d5740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000043 CR3: 0000000006880006 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> ? write_comp_data+0x2f/0x90 iommufd_fops_ioctl+0x1ef/0x310 __x64_sys_ioctl+0x10e/0x160 ? __pfx_iommufd_fops_ioctl+0x10/0x10 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Check that the user memory range doesn't overflow. Fixes: f4b20bb34c83 ("iommufd: Add kernel support for testing iommufd") Link: https://lore.kernel.org/r/0-v1-95390ed1df8d+8f-iommufd_mock_overflow_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Link: https://lore.kernel.org/r/Y/hOiilV1wJvu/Hv@xpf.sh.intel.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd/selftest: Make selftest create a more complete mock deviceJason Gunthorpe
iommufd wants to use more infrastructure, like the iommu_group, that the mock device does not support. Create a more complete mock device that can go through the whole cycle of ownership, blocking domain, and has an iommu_group. This requires creating a real struct device on a real bus to be able to connect it to a iommu_group. Unfortunately we cannot formally attach the mock iommu driver as an actual driver as the iommu core does not allow more than one driver or provide a general way for busses to link to iommus. This can be solved with a little hack to open code the dev_iommus struct. With this infrastructure things work exactly the same as the normal domain path, including the auto domains mechanism and direct attach of hwpts. As the created hwpt is now an autodomain it is no longer required to destroy it and trying to do so will trigger a failure. Link: https://lore.kernel.org/r/11-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd/selftest: Rename the sefltest 'device_id' to 'stdev_id'Jason Gunthorpe
It is too confusing now that we have the 'dev_id' as part of the main interface. Make it clear this is the special selftest device object. This object is analogous to the VFIO device FD. Link: https://lore.kernel.org/r/7-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd: Make iommufd_hw_pagetable_alloc() do iopt_table_add_domain()Jason Gunthorpe
The HWPT is always linked to an IOAS and once a HWPT exists its domain should be fully mapped. This ended up being split up into device.c during a two phase creation that was a bit confusing. Move the iopt_table_add_domain() into iommufd_hw_pagetable_alloc() by having it call back to device.c to complete the domain attach in the required order. Calling iommufd_hw_pagetable_alloc() with immediate_attach = false will work on most drivers, but notably the SMMU drivers will fail because they can't decide what kind of domain to create until they are attached. This will be fixed when the domain_alloc function can take in a struct device. Link: https://lore.kernel.org/r/6-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd: Move iommufd_device to iommufd_private.hJason Gunthorpe
hw_pagetable.c will need this in the next patches. Link: https://lore.kernel.org/r/5-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd: Move ioas related HWPT destruction into iommufd_hw_pagetable_destroy()Jason Gunthorpe
A HWPT is permanently associated with an IOAS when it is created, remove the strange situation where a refcount != 0 HWPT can have been disconnected from the IOAS by putting all the IOAS related destruction in the object destroy function. Initializing a HWPT is two stages, we have to allocate it, attach it to a device and then populate the domain. Once the domain is populated it is fully linked to the IOAS. Arrange things so that all the error unwinds flow through the iommufd_hw_pagetable_destroy() and allow it to handle all cases. Link: https://lore.kernel.org/r/4-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd: Consistently manage hwpt_itemJason Gunthorpe
This should be added immediately after every iopt_table_add_domain(), and deleted after every iopt_table_remove_domain() under the ioas->mutex. Tidy things to be consistent. Link: https://lore.kernel.org/r/3-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd: Add iommufd_lock_obj() around the auto-domains hwptsJason Gunthorpe
A later patch will require this locking - currently under the ioas mutex the hwpt can not have a 0 reference and be on the list. Link: https://lore.kernel.org/r/2-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-06iommufd: Assert devices_lock for iommufd_hw_pagetable_has_group()Jason Gunthorpe
The hwpt->devices list is locked by this, make it clearer. Link: https://lore.kernel.org/r/1-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-27Merge tag 'soc-drivers-6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "As usual, there are lots of minor driver changes across SoC platforms from NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung. These usually add support for additional chip variations in existing drivers, but also add features or bugfixes. The SCMI firmware subsystem gains a unified raw userspace interface through debugfs, which can be used for validation purposes. Newly added drivers include: - New power management drivers for StarFive JH7110, Allwinner D1 and Renesas RZ/V2M - A driver for Qualcomm battery and power supply status - A SoC device driver for identifying Nuvoton WPCM450 chips - A regulator coupler driver for Mediatek MT81xxv" * tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits) power: supply: Introduce Qualcomm PMIC GLINK power supply soc: apple: rtkit: Do not copy the reg state structure to the stack soc: sunxi: SUN20I_PPU should depend on PM memory: renesas-rpc-if: Remove redundant division of dummy soc: qcom: socinfo: Add IDs for IPQ5332 and its variant dt-bindings: arm: qcom,ids: Add IDs for IPQ5332 and its variant dt-bindings: power: qcom,rpmpd: add RPMH_REGULATOR_LEVEL_LOW_SVS_L1 firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/ MAINTAINERS: Update qcom CPR maintainer entry dt-bindings: firmware: document Qualcomm SM8550 SCM dt-bindings: firmware: qcom,scm: add qcom,scm-sa8775p compatible soc: qcom: socinfo: Add Soc IDs for IPQ8064 and variants dt-bindings: arm: qcom,ids: Add Soc IDs for IPQ8064 and variants soc: qcom: socinfo: Add support for new field in revision 17 soc: qcom: smd-rpm: Add IPQ9574 compatible soc: qcom: pmic_glink: remove redundant calculation of svid soc: qcom: stats: Populate all subsystem debugfs files dt-bindings: soc: qcom,rpmh-rsc: Update to allow for generic nodes soc: qcom: pmic_glink: add CONFIG_NET/CONFIG_OF dependencies soc: qcom: pmic_glink: Introduce altmode support ...
2023-02-24Merge tag 'for-linus-iommufd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "Some polishing and small fixes for iommufd: - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem - Use GFP_KERNEL_ACCOUNT inside the iommu_domains - Support VFIO_NOIOMMU mode with iommufd - Various typos - A list corruption bug if HWPTs are used for attach" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Do not add the same hwpt to the ioas->hwpt_list twice iommufd: Make sure to zero vfio_iommu_type1_info before copying to user vfio: Support VFIO_NOIOMMU with iommufd iommufd: Add three missing structures in ucmd_buffer selftests: iommu: Fix test_cmd_destroy_access() call in user_copy iommu: Remove IOMMU_CAP_INTR_REMAP irq/s390: Add arch_is_isolated_msi() for s390 iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI genirq/irqdomain: Remove unused irq_domain_check_msi_remap() code iommufd: Convert to msi_device_has_isolated_msi() vfio/type1: Convert to iommu_group_has_isolated_msi() iommu: Add iommu_group_has_isolated_msi() genirq/msi: Add msi_device_has_isolated_msi()
2023-02-24Merge tag 'iommu-updates-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Consolidate iommu_map/unmap functions. There have been blocking and atomic variants so far, but that was problematic as this approach does not scale with required new variants which just differ in the GFP flags used. So Jason consolidated this back into single functions that take a GFP parameter. - Retire the detach_dev() call-back in iommu_ops - Arm SMMU updates from Will: - Device-tree binding updates: - Cater for three power domains on SM6375 - Document existing compatible strings for Qualcomm SoCs - Tighten up clocks description for platform-specific compatible strings - Enable Qualcomm workarounds for some additional platforms that need them - Intel VT-d updates from Lu Baolu: - Add Intel IOMMU performance monitoring support - Set No Execute Enable bit in PASID table entry - Two performance optimizations - Fix PASID directory pointer coherency - Fix missed rollbacks in error path - Cleanups - Apple t8110 DART support - Exynos IOMMU: - Implement better fault handling - Error handling fixes - Renesas IPMMU: - Add device tree bindings for r8a779g0 - AMD IOMMU: - Various fixes for handling on SNP-enabled systems and handling of faults with unknown request-ids - Cleanups and other small fixes - Various other smaller fixes and cleanups * tag 'iommu-updates-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (71 commits) iommu/amd: Skip attach device domain is same as new domain iommu: Attach device group to old domain in error path iommu/vt-d: Allow to use flush-queue when first level is default iommu/vt-d: Fix PASID directory pointer coherency iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode iommu/vt-d: Fix error handling in sva enable/disable paths iommu/amd: Improve page fault error reporting iommu/amd: Do not identity map v2 capable device when snp is enabled iommu: Fix error unwind in iommu_group_alloc() iommu/of: mark an unused function as __maybe_unused iommu: dart: DART_T8110_ERROR range should be 0 to 5 iommu/vt-d: Enable IOMMU perfmon support iommu/vt-d: Add IOMMU perfmon overflow handler support iommu/vt-d: Support cpumask for IOMMU perfmon iommu/vt-d: Add IOMMU perfmon support iommu/vt-d: Support Enhanced Command Interface iommu/vt-d: Retrieve IOMMU perfmon capability information iommu/vt-d: Support size of the register set in DRHD iommu/vt-d: Set No Execute Enable bit in PASID table entry iommu/vt-d: Remove sva from intel_svm_dev ...
2023-02-21Merge tag 'v6.2' into iommufd.git for-nextJason Gunthorpe
Resolve conflicts from the signature change in iommu_map: - drivers/infiniband/hw/usnic/usnic_uiom.c Switch iommu_map_atomic() to iommu_map(.., GFP_ATOMIC) - drivers/vfio/vfio_iommu_type1.c Following indenting change for GFP_KERNEL Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-18Merge branches 'apple/dart', 'arm/exynos', 'arm/renesas', 'arm/smmu', ↵Joerg Roedel
'x86/vt-d', 'x86/amd' and 'core' into next
2023-02-18iommu/amd: Skip attach device domain is same as new domainVasant Hegde
If device->domain is same as new domain then we can skip the device attach process. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230215052642.6016-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-18iommu: Attach device group to old domain in error pathVasant Hegde
iommu_attach_group() attaches all devices in a group to domain and then sets group domain (group->domain). Current code (__iommu_attach_group()) does not handle error path. This creates problem as devices to domain attachment is in inconsistent state. Flow: - During boot iommu attach devices to default domain - Later some device driver (like amd/iommu_v2 or vfio) tries to attach device to new domain. - In iommu_attach_group() path we detach device from current domain. Then it tries to attach devices to new domain. - If it fails to attach device to new domain then device to domain link is broken. - iommu_attach_group() returns error. - At this stage iommu_attach_group() caller thinks, attaching device to new domain failed and devices are still attached to old domain. - But in reality device to old domain link is broken. It will result in all sort of failures (like IO page fault) later. To recover from this situation, we need to attach all devices back to the old domain. Also log warning if it fails attach device back to old domain. Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Reported-by: Matt Fagnani <matt.fagnani@bell.net> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Matt Fagnani <matt.fagnani@bell.net> Link: https://lore.kernel.org/r/20230215052642.6016-1-vasant.hegde@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216865 Link: https://lore.kernel.org/lkml/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@leemhuis.info/ Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/vt-d: Allow to use flush-queue when first level is defaultTina Zhang
Commit 29b32839725f ("iommu/vt-d: Do not use flush-queue when caching-mode is on") forced default domains to be strict mode as long as IOMMU caching-mode is flagged. The reason for doing this is that when vIOMMU uses VT-d caching mode to synchronize shadowing page tables, the strict mode shows better performance. However, this optimization is orthogonal to the first-level page table because the Intel VT-d architecture does not define the caching mode of the first-level page table. Refer to VT-d spec, section 6.1, "When the CM field is reported as Set, any software updates to remapping structures other than first-stage mapping (including updates to not- present entries or present entries whose programming resulted in translation faults) requires explicit invalidation of the caches." Exclude the first-level page table from this optimization. Generally using first-stage translation in vIOMMU implies nested translation enabled in the physical IOMMU. In this case the first-stage page table is wholly captured by the guest. The vIOMMU only needs to transfer the cache invalidations on vIOMMU to the physical IOMMU. Forcing the default domain to strict mode will cause more frequent cache invalidations, resulting in performance degradation. In a real performance benchmark test measured by iperf receive, the performance result on Sapphire Rapids 100Gb NIC shows: w/ this fix ~51 Gbits/s, w/o this fix ~39.3 Gbits/s. Theoretically a first-stage IOMMU page table can still be shadowed in absence of the caching mode, e.g. with host write-protecting guest IOMMU page table to synchronize changed PTEs with the physical IOMMU page table. In this case the shadowing overhead is decoupled from emulating IOTLB invalidation then the overhead of the latter part is solely decided by the frequency of IOTLB invalidations. Hence allowing guest default dma domain to be lazy can also benefit the overall performance by reducing the total VM-exit numbers. Fixes: 29b32839725f ("iommu/vt-d: Do not use flush-queue when caching-mode is on") Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Tina Zhang <tina.zhang@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20230214025618.2292889-1-tina.zhang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/vt-d: Fix PASID directory pointer coherencyJacob Pan
On platforms that do not support IOMMU Extended capability bit 0 Page-walk Coherency, CPU caches are not snooped when IOMMU is accessing any translation structures. IOMMU access goes only directly to memory. Intel IOMMU code was missing a flush for the PASID table directory that resulted in the unrecoverable fault as shown below. This patch adds clflush calls whenever allocating and updating a PASID table directory to ensure cache coherency. On the reverse direction, there's no need to clflush the PASID directory pointer when we deactivate a context entry in that IOMMU hardware will not see the old PASID directory pointer after we clear the context entry. PASID directory entries are also never freed once allocated. DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Read NO_PASID] Request device [00:0d.2] fault addr 0x1026a4000 [fault reason 0x51] SM: Present bit in Directory Entry is clear DMAR: Dump dmar1 table entries for IOVA 0x1026a4000 DMAR: scalable mode root entry: hi 0x0000000102448001, low 0x0000000101b3e001 DMAR: context entry: hi 0x0000000000000000, low 0x0000000101b4d401 DMAR: pasid dir entry: 0x0000000101b4e001 DMAR: pasid table entry[0]: 0x0000000000000109 DMAR: pasid table entry[1]: 0x0000000000000001 DMAR: pasid table entry[2]: 0x0000000000000000 DMAR: pasid table entry[3]: 0x0000000000000000 DMAR: pasid table entry[4]: 0x0000000000000000 DMAR: pasid table entry[5]: 0x0000000000000000 DMAR: pasid table entry[6]: 0x0000000000000000 DMAR: pasid table entry[7]: 0x0000000000000000 DMAR: PTE not present at level 4 Cc: <stable@vger.kernel.org> Fixes: 0bbeb01a4faf ("iommu/vt-d: Manage scalalble mode PASID tables") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Sukumar Ghorai <sukumar.ghorai@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230209212843.1788125-1-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/vt-d: Avoid superfluous IOTLB tracking in lazy modeJacob Pan
Intel IOMMU driver implements IOTLB flush queue with domain selective or PASID selective invalidations. In this case there's no need to track IOVA page range and sync IOTLBs, which may cause significant performance hit. This patch adds a check to avoid IOVA gather page and IOTLB sync for the lazy path. The performance difference on Sapphire Rapids 100Gb NIC is improved by the following (as measured by iperf send): w/o this fix~48 Gbits/s. with this fix ~54 Gbits/s Cc: <stable@vger.kernel.org> Fixes: 2a2b8eaa5b25 ("iommu: Handle freelists when using deferred flushing in iommu drivers") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230209175330.1783556-1-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/vt-d: Fix error handling in sva enable/disable pathsLu Baolu
Roll back all previous actions in error paths of intel_iommu_enable_sva() and intel_iommu_disable_sva(). Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230208051559.700109-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/amd: Improve page fault error reportingVasant Hegde
If IOMMU domain for device group is not setup properly then we may hit IOMMU page fault. Current page fault handler assumes that domain is always setup and it will hit NULL pointer derefence (see below sample log). Lets check whether domain is setup or not and log appropriate message. Sample log: ---------- amdgpu 0000:00:01.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 6 BUG: kernel NULL pointer dereference, address: 0000000000000058 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 56 Comm: irq/24-AMD-Vi Not tainted 6.2.0-rc2+ #89 Hardware name: xxx RIP: 0010:report_iommu_fault+0x11/0x90 [...] Call Trace: <TASK> amd_iommu_int_thread+0x60c/0x760 ? __pfx_irq_thread_fn+0x10/0x10 irq_thread_fn+0x1f/0x60 irq_thread+0xea/0x1a0 ? preempt_count_add+0x6a/0xa0 ? __pfx_irq_thread_dtor+0x10/0x10 ? __pfx_irq_thread+0x10/0x10 kthread+0xe9/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> Reported-by: Matt Fagnani <matt.fagnani@bell.net> Suggested-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216865 Link: https://lore.kernel.org/lkml/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@leemhuis.info/ Link: https://lore.kernel.org/r/20230215052642.6016-3-vasant.hegde@amd.com Cc: stable@vger.kernel.org [joro: Edit commit message] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/amd: Do not identity map v2 capable device when snp is enabledVasant Hegde
Flow: - Booted system with SNP enabled, memory encryption off and IOMMU DMA translation mode - AMD driver detects v2 capable device and amd_iommu_def_domain_type() returns identity mode - amd_iommu_domain_alloc() returns NULL an SNP is enabled - System will fail to register device On SNP enabled system, passthrough mode is not supported. IOMMU default domain is set to translation mode. We need to return zero from amd_iommu_def_domain_type() so that it allocates translation domain. Fixes: fb2accadaa94 ("iommu/amd: Introduce function to check and enable SNP") CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230207091752.7656-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu: Fix error unwind in iommu_group_alloc()Jason Gunthorpe
If either iommu_group_grate_file() fails then the iommu_group is leaked. Destroy it on these error paths. Found by kselftest/iommu/iommufd_fail_nth Fixes: bc7d12b91bd3 ("iommu: Implement reserved_regions iommu-group sysfs file") Fixes: c52c72d3dee8 ("iommu: Add sysfs attribyte for domain type") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/0-v1-8f616bee028d+8b-iommu_group_alloc_leak_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-16iommu/of: mark an unused function as __maybe_unusedRandy Dunlap
When CONFIG_OF_ADDRESS is not set, there is a build warning/error about an unused function. Annotate the function to quieten the warning/error. ../drivers/iommu/of_iommu.c:176:29: warning: 'iommu_resv_region_get_type' defined but not used [-Wunused-function] 176 | static enum iommu_resv_type iommu_resv_region_get_type(struct device *dev, struct resource *phys, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: a5bf3cfce8cb ("iommu: Implement of_iommu_get_resv_regions()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Will Deacon <will@kernel.org> Cc: iommu@lists.linux.dev Reviewed-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230209010359.23831-1-rdunlap@infradead.org [joro: Improve code formatting] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-02-15iommufd: Do not add the same hwpt to the ioas->hwpt_list twiceJason Gunthorpe
The hwpt is added to the hwpt_list only during its creation, it is never added again. This hunk is some missed leftover from rework. Adding it twice will corrupt the linked list in some cases. It effects HWPT specific attachment, which is something the test suite cannot cover until we can create a legitimate struct device with a non-system iommu "driver" (ie we need the bus removed from the iommu code) Cc: stable@vger.kernel.org Fixes: e8d57210035b ("iommufd: Add kAPI toward external drivers for physical devices") Link: https://lore.kernel.org/r/1-v1-4336b5cb2fe4+1d7-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-14iommufd: Make sure to zero vfio_iommu_type1_info before copying to userJason Gunthorpe
Missed a zero initialization here. Most of the struct is filled with a copy_from_user(), however minsz for that copy is smaller than the actual struct by 8 bytes, thus we don't fill the padding. Cc: stable@vger.kernel.org # 6.1+ Fixes: d624d6652a65 ("iommufd: vfio container FD ioctl compatibility") Link: https://lore.kernel.org/r/0-v1-a74499ece799+1a-iommufd_get_info_leak_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: syzbot+cb1e0978f6bf46b83a58@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>