summaryrefslogtreecommitdiff
path: root/drivers/iommu/amd/iommu.c
AgeCommit message (Collapse)Author
2024-12-18iommu/amd: Lock DTE before updating the entry with WRITE_ONCE()Suravee Suthikulpanit
When updating only within a 64-bit tuple of a DTE, just lock the DTE and use WRITE_ONCE() because it is writing to memory read back by HW. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-9-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18iommu/amd: Modify clear_dte_entry() to avoid in-place updateSuravee Suthikulpanit
By reusing the make_clear_dte() and update_dte256(). Also, there is no need to set TV bit for non-SNP system when clearing DTE for blocked domain, and no longer need to apply erratum 63 in clear_dte() since it is already stored in struct ivhd_dte_flags and apply in set_dte_entry(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-8-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18iommu/amd: Introduce helper function get_dte256()Suravee Suthikulpanit
And use it in clone_alias() along with update_dte256(). Also use get_dte256() in dump_dte_entry(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-7-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18iommu/amd: Modify set_dte_entry() to use 256-bit DTE helpersSuravee Suthikulpanit
Also, the set_dte_entry() is used to program several DTE fields (e.g. stage1 table, stage2 table, domain id, and etc.), which is difficult to keep track with current implementation. Therefore, separate logic for clearing DTE (i.e. make_clear_dte) and another function for setting up the GCR3 Table Root Pointer, GIOV, GV, GLX, and GuestPagingMode into another function set_dte_gcr3_table(). Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-6-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-18iommu/amd: Introduce helper function to update 256-bit DTESuravee Suthikulpanit
The current implementation does not follow 128-bit write requirement to update DTE as specified in the AMD I/O Virtualization Techonology (IOMMU) Specification. Therefore, modify the struct dev_table_entry to contain union of u128 data array, and introduce a helper functions update_dte256() to update DTE using two 128-bit cmpxchg operations to update 256-bit DTE with the modified structure, and take into account the DTE[V, GV] bits when programming the DTE to ensure proper order of DTE programming and flushing. In addition, introduce a per-DTE spin_lock struct dev_data.dte_lock to provide synchronization when updating the DTE to prevent cmpxchg128 failure. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241118054937.5203-5-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10iommu/amd: Add lockdep asserts for domain->dev_listJason Gunthorpe
Add an assertion to all the iteration points that don't obviously have the lock held already. These all take the locker higher in their call chains. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/2-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-12-10iommu/amd: Put list_add/del(dev_data) back under the domain->lockJason Gunthorpe
The list domain->dev_list is protected by the domain->lock spinlock. Any iteration, addition or removal must be under the lock. Move the list_del() up into the critical section. pdom_is_sva_capable(), and destroy_gcr3_table() do not interact with the list element. Wrap the list_add() in a lock, it would make more sense if this was under the same critical section as adjusting the refcounts earlier, but that requires more complications. Fixes: d6b47dec3684 ("iommu/amd: Reduce domain lock scope in attach device path") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/1-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-22iommu: Rename ops->domain_alloc_user() to domain_alloc_paging_flags()Jason Gunthorpe
Now that the main domain allocating path is calling this function it doesn't make sense to leave it named _user. Change the name to alloc_paging_flags() to mirror the new iommu_paging_domain_alloc_flags() function. A driver should implement only one of ops->domain_alloc_paging() or ops->domain_alloc_paging_flags(). The former is a simpler interface with less boiler plate that the majority of drivers use. The latter is for drivers with a greater feature set (PASID, multiple page table support, advanced iommufd support, nesting, etc). Additional patches will be needed to achieve this. Link: https://patch.msgid.link/r/2-v1-c252ebdeb57b+329-iommu_paging_flags_jgg@nvidia.com Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-10-30iommu/amd: Improve amd_iommu_release_device()Vasant Hegde
Previous patch added ops->release_domain support. Core will attach devices to release_domain->attach_dev() before calling this function. Devices are already detached their current domain and attached to blocked domain. This is mostly dummy function now. Just throw warning if device is still attached to domain. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-13-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Add ops->release_domainVasant Hegde
In release path, remove device from existing domain and attach it to blocked domain. So that all DMAs from device is blocked. Note that soon blocked_domain will support other ops like set_dev_pasid() but release_domain supports only attach_dev ops. Hence added separate 'release_domain' variable. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-12-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Reorder attach device codeVasant Hegde
Ideally in attach device path, it should take dev_data lock before making changes to device data including IOPF enablement. So far dev_data was using spinlock and it was hitting lock order issue when it tries to enable IOPF. Hence Commit 526606b0a199 ("iommu/amd: Fix Invalid wait context issue") moved IOPF enablement outside dev_data->lock. Previous patch converted dev_data lock to mutex. Now its safe to call amd_iommu_iopf_add_device() with dev_data->mutex. Hence move back PCI device capability enablement (ATS, PRI, PASID) and IOPF enablement code inside the lock. Also in attach_device(), update 'dev_data->domain' at the end so that error handling becomes simple. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-11-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Convert dev_data lock from spinlock to mutexVasant Hegde
Currently in attach device path it takes dev_data->spinlock. But as per design attach device path can sleep. Also if device is PRI capable then it adds device to IOMMU fault handler queue which takes mutex. Hence currently PRI enablement is done outside dev_data lock. Covert dev_data lock from spinlock to mutex so that it follows the design and also PRI enablement can be done properly. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Rearrange attach device codeVasant Hegde
attach_device() is just holding lock and calling do_attach(). There is not need to have another function. Just move do_attach() code to attach_device(). Similarly move do_detach() code to detach_device(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Reduce domain lock scope in attach device pathVasant Hegde
Currently attach device path takes protection domain lock followed by dev_data lock. Most of the operations in this function is specific to device data except pdom_attach_iommu() where it updates protection domain structure. Hence reduce the scope of protection domain lock. Note that this changes the locking order. Now it takes device lock before taking doamin lock (group->mutex -> dev_data->lock -> pdom->lock). dev_data->lock is used only in device attachment path. So changing order is fine. It will not create any issue. Finally move numa node assignment to pdom_attach_iommu(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Do not detach devices in domain free pathVasant Hegde
All devices attached to a protection domain must be freed before calling domain free. Hence do not try to free devices in domain free path. Continue to throw warning if pdom->dev_list is not empty so that any potential issues can be fixed. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: xarray to track protection_domain->iommu listVasant Hegde
Use xarray to track IOMMU attached to protection domain instead of static array of MAX_IOMMUS. Also add lockdep assertion. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Remove protection_domain.dev_cnt variableVasant Hegde
protection_domain->dev_list tracks list of attached devices to domain. We can use list_* functions on dev_list to get device count. Hence remove 'dev_cnt' variable. No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30iommu/amd: Use ida interface to manage protection domain IDVasant Hegde
Replace custom domain ID allocator with IDA interface. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241030063556.6104-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-30Merge branch 'core' into amd/amd-viJoerg Roedel
2024-10-29iommu/amd: Implement global identity domainVasant Hegde
Implement global identity domain. All device groups in identity domain will share this domain. In attach device path, based on device capability it will allocate per device domain ID and GCR3 table. So that it can support SVA. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241028093810.5901-11-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29iommu/amd: Enhance amd_iommu_domain_alloc_user()Vasant Hegde
Previous patch enhanced core layer to check device PASID capability and pass right flags to ops->domain_alloc_user(). Enhance amd_iommu_domain_alloc_user() to allocate domain with appropriate page table based on flags parameter. - If flags is empty then allocate domain with default page table type. This will eventually replace ops->domain_alloc(). For UNMANAGED domain, core will call this interface with flags=0. So AMD driver will continue to allocate V1 page table. - If IOMMU_HWPT_ALLOC_PASID flags is passed then allocate domain with v2 page table. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29iommu/amd: Pass page table type as param to pdom_setup_pgtable()Vasant Hegde
Current code forces v1 page table for UNMANAGED domain and global page table type (amd_iommu_pgtable) for rest of paging domain. Following patch series adds support for domain_alloc_paging() ops. Also enhances domain_alloc_user() to allocate page table based on 'flags. Hence pass page table type as parameter to pdomain_setup_pgtable(). So that caller can decide right page table type. Also update dma_max_address() to take pgtable as parameter. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-29iommu/amd: Separate page table setup from domain allocationVasant Hegde
Currently protection_domain_alloc() allocates domain and also sets up page table. Page table setup is required for PAGING domain only. Domain type like SVA doesn't need page table. Hence move page table setup code to separate function. Also SVA domain allocation path does not call pdom_setup_pgtable(). Hence remove IOMMU_DOMAIN_SVA type check. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jacob Pan <jacob.pan@linux.microsoft.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241028093810.5901-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-15iommu/amd: Use atomic64_inc_return() in iommu.cUros Bizjak
Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref) to use optimized implementation and ease register pressure around the primitive for targets that implement optimized variant. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20241007084356.47799-1-ubizjak@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-13Merge branches 'fixes', 'arm/smmu', 'intel/vt-d', 'amd/amd-vi' and 'core' ↵Joerg Roedel
into next
2024-09-12iommu/amd: Test for PAGING domains before freeing a domainJason Gunthorpe
This domain free function can be called for IDENTITY and SVA domains too, and they don't have page tables. For now protect against this by checking the type. Eventually the different types should have their own free functions. Fixes: 485534bfccb2 ("iommu/amd: Remove conditions from domain free paths") Reported-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/0-v1-ad9884ee5f5b+da-amd_iopgtbl_fix_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-12iommu/amd: Fix argument order in amd_iommu_dev_flush_pasid_all()Eliav Bar-ilan
An incorrect argument order calling amd_iommu_dev_flush_pasid_pages() causes improper flushing of the IOMMU, leaving the old value of GCR3 from a previous process attached to the same PASID. The function has the signature: void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data, ioasid_t pasid, u64 address, size_t size) Correct the argument order. Cc: stable@vger.kernel.org Fixes: 474bf01ed9f0 ("iommu/amd: Add support for device based TLB invalidation") Signed-off-by: Eliav Bar-ilan <eliavb@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/0-v1-fc6bc37d8208+250b-amd_pasid_flush_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove conditions from domain free pathsJason Gunthorpe
Don't use tlb as some flag to indicate if protection_domain_alloc() completed. Have protection_domain_alloc() unwind itself in the normal kernel style and require protection_domain_free() only be called on successful results of protection_domain_alloc(). Also, the amd_iommu_domain_free() op is never called by the core code with a NULL argument, so remove all the NULL tests as well. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Store the nid in io_pgtable_cfg instead of the domainJason Gunthorpe
We already have memory in the union here that is being wasted in AMD's case, use it to store the nid. Putting the nid here further isolates the io_pgtable code from the struct protection_domain. Fixup protection_domain_alloc so that the NID from the device is provided, at this point dev is never NULL for AMD so this will now allocate the first table pointer on the correct NUMA node. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/8-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove amd_io_pgtable::pgtbl_cfgJason Gunthorpe
This struct is already in iop.cfg, we don't need two. AMD is using this API sort of wrong, the cfg is supposed to be passed in and then the allocation function will allocate ops memory and copy the passed config into the new memory. Keep it kind of wrong and pass in the cfg memory that is already part of the pagetable struct. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Rename struct amd_io_pgtable iopt to pgtblJason Gunthorpe
There is struct protection_domain iopt and struct amd_io_pgtable iopt. Next patches are going to want to write domain.iopt.iopt.xx which is quite unnatural to read. Give one of them a different name, amd_io_pgtable has fewer references so call it pgtbl, to match pgtbl_cfg, instead. Suggested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove amd_iommu_domain_update() from page table freeingJason Gunthorpe
It is a serious bug if the domain is still mapped to any DTEs when it is freed as we immediately start freeing page table memory, so any remaining HW touch will UAF. If it is not mapped then dev_list is empty and amd_iommu_domain_update() does nothing. Remove it and add a WARN_ON() to catch this class of bug. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Set the pgsize_bitmap correctlyJason Gunthorpe
When using io_pgtable the correct pgsize_bitmap is stored in the cfg, both v1_alloc_pgtable() and v2_alloc_pgtable() set it correctly. This fixes a bug where the v2 pgtable had the wrong pgsize as protection_domain_init_v2() would set it and then do_iommu_domain_alloc() immediately resets it. Remove the confusing ops.pgsize_bitmap since that is not used if the driver sets domain.pgsize_bitmap. Fixes: 134288158a41 ("iommu/amd: Add domain_alloc_user based domain allocation") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Move allocation of the top table into v1_alloc_pgtableJason Gunthorpe
All the page table memory should be allocated/free within the io_pgtable struct. The v2 path is already doing this, make it consistent. It is hard to see but the free of the root in protection_domain_free() is a NOP on the success path because v1_free_pgtable() does amd_iommu_domain_clr_pt_root(). The root memory is already freed because free_sub_pt() put it on the freelist. The free path in protection_domain_free() is only used during error unwind of protection_domain_alloc(). Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Make amd_iommu_dev_update_dte() staticVasant Hegde
As its used inside iommu.c only. Also rename function to dev_update_dte() as its static function. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Rework amd_iommu_update_and_flush_device_table()Vasant Hegde
Remove separate function to update and flush the device table as only amd_iommu_update_and_flush_device_table() calls these functions. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Make amd_iommu_domain_flush_complete() staticVasant Hegde
AMD driver uses amd_iommu_domain_flush_complete() function to make sure IOMMU processed invalidation commands before proceeding. Ideally this should be called from functions which updates DTE/invalidates caches. There is no need to call this function explicitly. This patches makes below changes : - Rename amd_iommu_domain_flush_complete() -> domain_flush_complete() and make it as static function. - Rearrage domain_flush_complete() to avoid forward declaration. - Update amd_iommu_update_and_flush_device_table() to call domain_flush_complete(). Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Make amd_iommu_dev_flush_pasid_all() staticVasant Hegde
As its not used outside iommu.c. Also rename it as dev_flush_pasid_all(). No functional change intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Handle error path in amd_iommu_probe_device()Vasant Hegde
Do not try to set max_pasids in error path as dev_data is not allocated. Fixes: a0c47f233e68 ("iommu/amd: Introduce iommu_dev_data.max_pasids") Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Make amd_iommu_is_attach_deferred() staticVasant Hegde
amd_iommu_is_attach_deferred() is a callback function called by iommu_ops. Make it as static. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Update event log pointer as soon as processing is completeVasant Hegde
Update event buffer head pointer once driver completes processing. So that IOMMU can write new log without waiting for driver to complete processing all event logs. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240828111029.5429-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-30iommu: Allow ATS to work on VFs when the PF uses IDENTITYJason Gunthorpe
PCI ATS has a global Smallest Translation Unit field that is located in the PF but shared by all of the VFs. The expectation is that the STU will be set to the root port's global STU capability which is driven by the IO page table configuration of the iommu HW. Today it becomes set when the iommu driver first enables ATS. Thus, to enable ATS on the VF, the PF must have already had the correct STU programmed, even if ATS is off on the PF. Unfortunately the PF only programs the STU when the PF enables ATS. The iommu drivers tend to leave ATS disabled when IDENTITY translation is being used. Thus we can get into a state where the PF is setup to use IDENTITY with the DMA API while the VF would like to use VFIO with a PAGING domain and have ATS turned on. This fails because the PF never loaded a PAGING domain and so it never setup the STU, and the VF can't do it. The simplest solution is to have the iommu driver set the ATS STU when it probes the device. This way the ATS STU is loaded immediately at boot time to all PFs and there is no issue when a VF comes to use it. Add a new call pci_prepare_ats() which should be called by iommu drivers in their probe_device() op for every PCI device if the iommu driver supports ATS. This will setup the STU based on whatever page size capability the iommu HW has. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/0-v1-0fb4d2ab6770+7e706-ats_vf_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-13iommu/amd: Add blocked domain supportVasant Hegde
Create global blocked domain with attach device ops. It will clear the DTE so that all DMA from device will be aborted. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240722115452.5976-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-06-27iommu/amd: Invalidate cache before removing device from domain listVasant Hegde
Commit 87a6f1f22c97 ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue") introduced per device domain ID when domain is configured with v2 page table. And in invalidation path, it uses per device structure (dev_data->gcr3_info.domid) to get the domain ID. In detach_device() path, current code tries to invalidate IOMMU cache after removing dev_data from domain device list. This means when domain is configured with v2 page table, amd_iommu_domain_flush_all() will not be able to invalidate cache as device is already removed from domain device list. This is causing change domain tests (changing domain type from identity to DMA) to fail with IO_PAGE_FAULT issue. Hence invalidate cache and update DTE before updating data structures. Reported-by: FahHean Lee <fahhean.lee@amd.com> Reported-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com> Fixes: 87a6f1f22c97 ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue") Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com> Tested-by: Sairaj Arun Kodilkar <sairaj.arunkodilkar@amd.com> Tested-by: FahHean Lee <fahhean.lee@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/20240620060552.13984-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-06-04iommu/amd: Fix Invalid wait context issueVasant Hegde
With commit c4cb23111103 ("iommu/amd: Add support for enable/disable IOPF") we are hitting below issue. This happens because in IOPF enablement path it holds spin lock with irq disable and then tries to take mutex lock. dmesg: ----- [ 0.938739] ============================= [ 0.938740] [ BUG: Invalid wait context ] [ 0.938742] 6.10.0-rc1+ #1 Not tainted [ 0.938745] ----------------------------- [ 0.938746] swapper/0/1 is trying to lock: [ 0.938748] ffffffff8c9f01d8 (&port_lock_key){....}-{3:3}, at: serial8250_console_write+0x78/0x4a0 [ 0.938767] other info that might help us debug this: [ 0.938768] context-{5:5} [ 0.938769] 7 locks held by swapper/0/1: [ 0.938772] #0: ffff888101a91310 (&group->mutex){+.+.}-{4:4}, at: bus_iommu_probe+0x70/0x160 [ 0.938790] #1: ffff888101d1f1b8 (&domain->lock){....}-{3:3}, at: amd_iommu_attach_device+0xa5/0x700 [ 0.938799] #2: ffff888101cc3d18 (&dev_data->lock){....}-{3:3}, at: amd_iommu_attach_device+0xc5/0x700 [ 0.938806] #3: ffff888100052830 (&iommu->lock){....}-{2:2}, at: amd_iommu_iopf_add_device+0x3f/0xa0 [ 0.938813] #4: ffffffff8945a340 (console_lock){+.+.}-{0:0}, at: _printk+0x48/0x50 [ 0.938822] #5: ffffffff8945a390 (console_srcu){....}-{0:0}, at: console_flush_all+0x58/0x4e0 [ 0.938867] #6: ffffffff82459f80 (console_owner){....}-{0:0}, at: console_flush_all+0x1f0/0x4e0 [ 0.938872] stack backtrace: [ 0.938874] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc1+ #1 [ 0.938877] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019 Fix above issue by re-arranging code in attach device path: - move device PASID/IOPF enablement outside lock in AMD IOMMU driver. This is safe as core layer holds group->mutex lock before calling iommu_ops->attach_dev. Reported-by: Borislav Petkov <bp@alien8.de> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com> Fixes: c4cb23111103 ("iommu/amd: Add support for enable/disable IOPF") Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240530084801.10758-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-05-21Merge tag 'pci-v6.10-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Skip E820 checks for MCFG ECAM regions for new (2016+) machines, since there's no requirement to describe them in E820 and some platforms require ECAM to work (Bjorn Helgaas) - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien Le Moal) - Remove last user and pci_enable_device_io() (Heiner Kallweit) - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen) - Skip waiting for devices that have been disconnected while suspended (Ilpo Järvinen) - Clear Secondary Status errors after enumeration since Master Aborts and Unsupported Request errors are an expected part of enumeration (Vidya Sagar) MSI: - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas) Error handling: - Mask Genesys GL975x SD host controller Replay Timer Timeout correctable errors caused by a hardware defect; the errors cause interrupts that prevent system suspend (Kai-Heng Feng) - Fix EDR-related _DSM support, which previously evaluated revision 5 but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan) ASPM: - Simplify link state definitions and mask calculation (Ilpo Järvinen) Power management: - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS apparently doesn't know how to put them back in D0 (Mario Limonciello) CXL: - Support resetting CXL devices; special handling required because CXL Ports mask Secondary Bus Reset by default (Dave Jiang) DOE: - Support DOE Discovery Version 2 (Alexey Kardashevskiy) Endpoint framework: - Set endpoint BAR to be 64-bit if the driver says that's all the device supports, in addition to doing so if the size is >2GB (Niklas Cassel) - Simplify endpoint BAR allocation and setting interfaces (Niklas Cassel) Cadence PCIe controller driver: - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof Kozlowski) Cadence PCIe endpoint driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) Freescale Layerscape PCIe controller driver: - Convert DT binding to YAML (Frank Li) MediaTek MT7621 PCIe controller driver: - Add DT binding missing 'reg' property for child Root Ports (Krzysztof Kozlowski) - Fix theoretical string truncation in PHY name (Sergio Paracuellos) NVIDIA Tegra194 PCIe controller driver: - Return success for endpoint probe instead of falling through to the failure path (Vidya Sagar) Renesas R-Car PCIe controller driver: - Add DT binding missing IOMMU properties (Geert Uytterhoeven) - Add DT binding R-Car V4H compatible for host and endpoint mode (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski) - Set the Subsystem Vendor ID, which was previously zero because it was masked incorrectly (Rick Wertenbroek) Synopsys DesignWare PCIe controller driver: - Restructure DBI register access to accommodate devices where this requires Refclk to be active (Manivannan Sadhasivam) - Remove the deinit() callback, which was only need by the pcie-rcar-gen4, and do it directly in that driver (Manivannan Sadhasivam) - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean up things like eDMA (Manivannan Sadhasivam) - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel to dw_pcie_ep_init() (Manivannan Sadhasivam) - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to reflect the actual functionality (Manivannan Sadhasivam) - Call dw_pcie_ep_init_registers() directly from all the glue drivers, not just those that require active Refclk from the host (Manivannan Sadhasivam) - Remove the "core_init_notifier" flag, which was an obscure way for glue drivers to indicate that they depend on Refclk from the host (Manivannan Sadhasivam) TI J721E PCIe driver: - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli) - Add DT binding J722S SoC support (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Add DT binding missing num-viewport, phys and phy-name properties (Jan Kiszka) Miscellaneous: - Constify and annotate with __ro_after_init (Heiner Kallweit) - Convert DT bindings to YAML (Krzysztof Kozlowski) - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming Zhou)" * tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits) PCI: Do not wait for disconnected devices when resuming x86/pci: Skip early E820 check for ECAM region PCI: Remove unused pci_enable_device_io() ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io() PCI: Update pci_find_capability() stub return types PCI: Remove PCI_IRQ_LEGACY scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios Revert "genirq/msi: Provide constants for PCI/IMS support" Revert "x86/apic/msi: Enable PCI/IMS" Revert "iommu/vt-d: Enable PCI/IMS" Revert "iommu/amd: Enable PCI/IMS" Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support" ...
2024-05-15Revert "iommu/amd: Enable PCI/IMS"Bjorn Helgaas
This reverts commit fa5745aca1dc819aee6463a2475b5c277f7cf8f6. IMS (Interrupt Message Store) support appeared in v6.2, but there are no users yet. Remove it for now. We can add it back when a user comes along. Link: https://lore.kernel.org/r/20240410221307.2162676-5-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2024-05-13Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'core' and 'x86/vt-d' ↵Joerg Roedel
into next
2024-04-26Merge branch 'memory-observability' into x86/amdJoerg Roedel
2024-04-26Merge branch 'iommu/fixes' into x86/amdJoerg Roedel