summaryrefslogtreecommitdiff
path: root/drivers/iommu/iommu-pages.h
AgeCommit message (Collapse)Author
2025-04-17iommu/vtd: Remove iommu_alloc_pages_node()Jason Gunthorpe
Intel is the only thing that uses this now, convert to the size versions, trying to avoid PAGE_SHIFT. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/23-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Remove iommu_alloc_page_node()Jason Gunthorpe
Use iommu_alloc_pages_node_sz() instead. AMD and Intel are both using 4K pages for these structures since those drivers only work on 4K PAGE_SIZE. riscv is also spec'd to use SZ_4K. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/21-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Remove iommu_alloc_page/pages()Jason Gunthorpe
A few small changes to the remaining drivers using these will allow them to be removed: - Exynos wants to allocate fixed 16K/8K allocations - Rockchip already has a define SPAGE_SIZE which is used by the dma_map immediately following, using SPAGE_ORDER which is a lg2size - tegra has size constants already for its two allocations Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Allow sub page sizes to be passed into the allocatorJason Gunthorpe
Generally drivers have a specific idea what their HW structure size should be. In a lot of cases this is related to PAGE_SIZE, but not always. ARM64, for example, allows a 4K IO page table size on a 64K CPU page table system. Currently we don't have any good support for sub page allocations, but make the API accommodate this by accepting a sub page size from the caller and rounding up internally. This is done by moving away from order as the size input and using size: size == 1 << (order + PAGE_SHIFT) Following patches convert drivers away from using order and try to specify allocation sizes independent of PAGE_SIZE. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/15-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Move from struct page to struct ioptdesc and folioJason Gunthorpe
This brings the iommu page table allocator into the modern world of having its own private page descriptor and not re-using fields from struct page for its own purpose. It follows the basic pattern of struct ptdesc which did this transformation for the CPU page table allocator. Currently iommu-pages is pretty basic so this isn't a huge benefit, however I see a coming need for features that CPU allocator has, like sub PAGE_SIZE allocations, and RCU freeing. This provides the base infrastructure to implement those cleanly. Remove numa_node_id() calls from the inlines and instead use NUMA_NO_NODE which will get switched to numa_mem_id(), which seems to be the right ID to use for memory allocations. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/13-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Remove iommu_put_pages_list_old and the _GenericJason Gunthorpe
Nothing uses the old list_head path now, remove it. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/12-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Formalize the freelist APIJason Gunthorpe
We want to get rid of struct page references outside the internal allocator implementation. The free list has the driver open code something like: list_add_tail(&virt_to_page(ptr)->lru, freelist); Move the above into a small inline and make the freelist into a wrapper type 'struct iommu_pages_list' so that the compiler can help check all the conversion. This struct has also proven helpful in some future ideas to convert to a singly linked list to get an extra pointer in the struct page, and to signal that the pages should be freed with RCU. Use a temporary _Generic so we don't need to rename the free function as the patches progress. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: De-inline the substantial functionsJason Gunthorpe
These are called in a lot of places and are not trivial. Move them to the core module. Tidy some of the comments and function arguments, fold __iommu_alloc_account() into its only caller, change __iommu_free_account() into __iommu_free_page() to remove some duplication. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Remove iommu_free_page()Jason Gunthorpe
Use iommu_free_pages() instead. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Remove the order argument to iommu_free_pages()Jason Gunthorpe
Now that we have a folio under the allocation iommu_free_pages() can know the order of the original allocation and do the correct thing to free it. The next patch will rename iommu_free_page() to iommu_free_pages() so we have naming consistency with iommu_alloc_pages_node(). Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Make iommu_put_pages_list() work with high order allocationsJason Gunthorpe
alloc_pages_node(, order) needs to be paired with __free_pages(, order) to free all the allocated pages. For order != 0 the return from alloc_pages_node() is just a page list, it hasn't been formed into a folio. However iommu_put_pages_list() just calls put_page() on the head page of an allocation, which will end up leaking the tail pages if order != 0. Fix this by using __GFP_COMP to create a high order folio and then always use put_page() to free the full high order folio. __iommu_free_account() can get the order of the allocation via folio_order(), which corrects the accounting of high order allocations in iommu_put_pages_list(). This is the same technique slub uses. As far as I can tell, none of the places using high order allocations are also using the free list, so this not a current bug. Fixes: 06c375053cef ("iommu/vt-d: add wrapper functions for page allocations") Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-17iommu/pages: Remove __iommu_alloc_pages()/__iommu_free_pages()Jason Gunthorpe
These were only used by tegra-smmu and leaked the struct page out of the API. Delete them since tega-smmu has been converted to the other APIs. In the process flatten the call tree so we have fewer one line functions calling other one line functions.. iommu_alloc_pages_node() is the real allocator and everything else can just call it directly. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu: account IOMMU allocated memoryPasha Tatashin
In order to be able to limit the amount of memory that is allocated by IOMMU subsystem, the memory must be accounted. Account IOMMU as part of the secondary pagetables as it was discussed at LPC. The value of SecPageTables now contains mmeory allocation by IOMMU and KVM. There is a difference between GFP_ACCOUNT and what NR_IOMMU_PAGES shows. GFP_ACCOUNT is set only where it makes sense to charge to user processes, i.e. IOMMU Page Tables, but there more IOMMU shared data that should not really be charged to a specific process. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-12-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu: observability of the IOMMU allocationsPasha Tatashin
Add NR_IOMMU_PAGES into node_stat_item that counts number of pages that are allocated by the IOMMU subsystem. The allocations can be view per-node via: /sys/devices/system/node/nodeN/vmstat. For example: $ grep iommu /sys/devices/system/node/node*/vmstat /sys/devices/system/node/node0/vmstat:nr_iommu_pages 106025 /sys/devices/system/node/node1/vmstat:nr_iommu_pages 3464 The value is in page-count, therefore, in the above example the iommu allocations amount to ~428M. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-11-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/vt-d: add wrapper functions for page allocationsPasha Tatashin
In order to improve observability and accountability of IOMMU layer, we must account the number of pages that are allocated by functions that are calling directly into buddy allocator. This is achieved by first wrapping the allocation related functions into a separate inline functions in new file: drivers/iommu/iommu-pages.h Convert all page allocation calls under iommu/intel to use these new functions. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-2-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>