summaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2025-06-27iommu/rockchip: prevent iommus dead loop when two masters share one IOMMUSimon Xue
When two masters share an IOMMU, calling ops->of_xlate during the second master's driver init may overwrite iommu->domain set by the first. This causes the check if (iommu->domain == domain) in rk_iommu_attach_device() to fail, resulting in the same iommu->node being added twice to &rk_domain->iommus, which can lead to an infinite loop in subsequent &rk_domain->iommus operations. Cc: <stable@vger.kernel.org> Fixes: 25c2325575cc ("iommu/rockchip: Add missing set_platform_dma_ops callback") Signed-off-by: Simon Xue <xxm@rock-chips.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250623020018.584802-1-xxm@rock-chips.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/apple-dart: Drop default ARCH_APPLE in KconfigSven Peter
When the first driver for Apple Silicon was upstreamed we accidentally included `default ARCH_APPLE` in its Kconfig which then spread to almost every subsequent driver. As soon as ARCH_APPLE is set to y this will pull in many drivers as built-ins which is not what we want. Thus, drop `default ARCH_APPLE` from Kconfig. Signed-off-by: Sven Peter <sven@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Janne Grunau <j@jannau.net> Link: https://lore.kernel.org/r/20250612-apple-kconfig-defconfig-v1-7-0e6f9cb512c1@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/qcom: Remove iommu_ops pgsize_bitmapJason Gunthorpe
This driver just uses a constant, put it in domain_alloc_paging and use the domain's value instead of ops during init_domain. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/6-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/mtk: Remove iommu_ops pgsize_bitmapJason Gunthorpe
This driver just uses a constant, put it in domain_alloc_paging and use the domain's value instead of ops during finalise. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/5-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu: Remove iommu_ops pgsize_bitmap from simple driversJason Gunthorpe
These drivers just have a constant value for their page size, move it into their domain_alloc_paging function before setting up the geometry. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> # for s390-iommu.c Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> # for exynos-iommu.c Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Chen-Yu Tsai <wens@csie.org> # sun50i-iommu.c Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/4-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu: Remove ops.pgsize_bitmap from drivers that don't use itJason Gunthorpe
These drivers all set the domain->pgsize_bitmap in their domain_alloc_paging() functions, so the ops value is never used. Delete it. Reviewed-by: Sven Peter <sven@svenpeter.dev> # for Apple DART Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Tomasz Jeznach <tjeznach@rivosinc.com> # for RISC-V Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/3-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/arm-smmu: Remove iommu_ops pgsize_bitmapJason Gunthorpe
The driver never reads this value, arm_smmu_init_domain_context() always sets domain.pgsize_bitmap to smmu->pgsize_bitmap, the per-instance value. Remove the ops version entirely, the related dead code and make arm_smmu_ops const. Since this driver does not yet finalize the domain under arm_smmu_domain_alloc_paging() add a page size initialization to alloc so the page size is still setup prior to attach. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/2-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27qiommu/arm-smmu-v3: Remove iommu_ops pgsize_bitmapJason Gunthorpe
The driver never reads this value, arm_smmu_domain_finalise() always sets domain.pgsize_bitmap to pgtbl_cfg, which comes from the per-smmu calculated value. Remove the ops version entirely, the related dead code and make arm_smmu_ops const. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/1-v2-68a2e1ba507c+1fb-iommu_rm_ops_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/amd: Add efr[HATS] max v1 page table levelAnkit Soni
The EFR[HATS] bits indicate maximum host translation level supported by IOMMU. Adding support to set the maximum host page table level as indicated by EFR[HATS]. If the HATS=11b (reserved), the driver will attempt to use guest page table for DMA API. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Ankit Soni <Ankit.Soni@amd.com> Link: https://lore.kernel.org/r/df0f8562c2a20895cc185c86f1a02c4d826fd597.1749016436.git.Ankit.Soni@amd.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-27iommu/amd: Add HATDis feature supportAnkit Soni
Current AMD IOMMU assumes Host Address Translation (HAT) is always supported, and Linux kernel enables this capability by default. However, in case of emulated and virtualized IOMMU, this might not be the case. For example,current QEMU-emulated AMD vIOMMU does not support host translation for VFIO pass-through device, but the interrupt remapping support is required for x2APIC (i.e. kvm-msi-ext-dest-id is also not supported by the guest OS). This would require the guest kernel to boot with guest kernel option iommu=pt to by-pass the initialization of host (v1) table. The AMD I/O Virtualization Technology (IOMMU) Specification Rev 3.10 [1] introduces a new flag 'HATDis' in the IVHD 11h IOMMU attributes to indicate that HAT is not supported on a particular IOMMU instance. Therefore, modifies the AMD IOMMU driver to detect the new HATDis attributes, and disable host translation and switch to use guest translation if it is available. Otherwise, the driver will disable DMA translation. [1] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/48882_IOMMU.pdf Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Ankit Soni <Ankit.Soni@amd.com> Link: https://lore.kernel.org/r/8109b208f87b80e400c2abd24a2e44fcbc0763a5.1749016436.git.Ankit.Soni@amd.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-23iommu/amd: KVM: SVM: Allow KVM to control need for GA log interruptsSean Christopherson
Add plumbing to the AMD IOMMU driver to allow KVM to control whether or not an IRTE is configured to generate GA log interrupts. KVM only needs a notification if the target vCPU is blocking, so the vCPU can be awakened. If a vCPU is preempted or exits to userspace, KVM clears is_run, but will set the vCPU back to running when userspace does KVM_RUN and/or the vCPU task is scheduled back in, i.e. KVM doesn't need a notification. Unconditionally pass "true" in all KVM paths to isolate the IOMMU changes from the KVM changes insofar as possible. Opportunistically swap the ordering of parameters for amd_iommu_update_ga() so that the match amd_iommu_activate_guest_mode(). Note, as of this writing, the AMD IOMMU manual doesn't list GALogIntr as a non-cached field, but per AMD hardware architects, it's not cached and can be safely updated without an invalidation. Link: https://lore.kernel.org/all/b29b8c22-2fd4-4b5e-b755-9198874157c7@amd.com Cc: Vasant Hegde <vasant.hegde@amd.com> Cc: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20250611224604.313496-62-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: WARN if KVM calls GA IRTE helpers without virtual APIC supportSean Christopherson
WARN if KVM attempts to update IRTE entries when virtual APIC isn't fully supported, as KVM should guard all such calls on IRQ posting being enabled. Link: https://lore.kernel.org/r/20250611224604.313496-58-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: KVM: SVM: Add IRTE metadata to affined vCPU's list if AVIC is ↵Sean Christopherson
inhibited If an IRQ can be posted to a vCPU, but AVIC is currently inhibited on the vCPU, go through the dance of "affining" the IRTE to the vCPU, but leave the actual IRTE in remapped mode. KVM already handles the case where AVIC is inhibited => uninhibited with posted IRQs (see avic_set_pi_irte_mode()), but doesn't handle the scenario where a postable IRQ comes along while AVIC is inhibited. Link: https://lore.kernel.org/r/20250611224604.313496-45-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: KVM: SVM: Set pCPU info in IRTE when setting vCPU affinitySean Christopherson
Now that setting vCPU affinity is guarded with ir_list_lock, i.e. now that avic_physical_id_entry can be safely accessed, set the pCPU info straight-away when setting vCPU affinity. Putting the IRTE into posted mode, and then immediately updating the IRTE a second time if the target vCPU is running is wasteful and confusing. This also fixes a flaw where a posted IRQ that arrives between putting the IRTE into guest_mode and setting the correct destination could cause the IOMMU to ring the doorbell on the wrong pCPU. Link: https://lore.kernel.org/r/20250611224604.313496-44-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: Factor out helper for manipulating IRTE GA/CPU infoSean Christopherson
Split the guts of amd_iommu_update_ga() to a dedicated helper so that the logic can be shared with flows that put the IRTE into posted mode. Opportunistically move amd_iommu_update_ga() and its new helper above amd_iommu_activate_guest_mode() so that it's all co-located. Link: https://lore.kernel.org/r/20250611224604.313496-43-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: KVM: SVM: Infer IsRun from validity of pCPU destinationSean Christopherson
Infer whether or not a vCPU should be marked running from the validity of the pCPU on which it is running. amd_iommu_update_ga() already skips the IRTE update if the pCPU is invalid, i.e. passing %true for is_run with an invalid pCPU would be a blatant and egregrious KVM bug. Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-42-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: Document which IRTE fields amd_iommu_update_ga() can modifySean Christopherson
Add a comment to amd_iommu_update_ga() to document what fields it can safely modify without issuing an invalidation of the IRTE, and to explain its role in keeping GA IRTEs up-to-date. Per page 93 of the IOMMU spec dated Feb 2025: When virtual interrupts are enabled by setting MMIO Offset 0018h[GAEn] and IRTE[GuestMode=1], IRTE[IsRun], IRTE[Destination], and if present IRTE[GATag], are not cached by the IOMMU. Modifications to these fields do not require an invalidation of the Interrupt Remapping Table. Link: https://lore.kernel.org/all/9b7ceea3-8c47-4383-ad9c-1a9bbdc9044a@oracle.com Cc: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20250611224604.313496-41-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu: KVM: Split "struct vcpu_data" into separate AMD vs. Intel structsSean Christopherson
Split the vcpu_data structure that serves as a handoff from KVM to IOMMU drivers into vendor specific structures. Overloading a single structure makes the code hard to read and maintain, is *very* misleading as it suggests that mixing vendors is actually supported, and bastardizing Intel's posted interrupt descriptor address when AMD's IOMMU already has its own structure is quite unnecessary. Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-33-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: KVM: SVM: Pass NULL @vcpu_info to indicate "not guest mode"Sean Christopherson
Pass NULL to amd_ir_set_vcpu_affinity() to communicate "don't post to a vCPU" now that there's no need to communicate information back to KVM about the previous vCPU (KVM does its own tracking). Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-24-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23iommu/amd: KVM: SVM: Use pi_desc_addr to derive ga_root_ptrSean Christopherson
Use vcpu_data.pi_desc_addr instead of amd_iommu_pi_data.base to get the GA root pointer. KVM is the only source of amd_iommu_pi_data.base, and KVM's one and only path for writing amd_iommu_pi_data.base computes the exact same value for vcpu_data.pi_desc_addr and amd_iommu_pi_data.base, and fills amd_iommu_pi_data.base if and only if vcpu_data.pi_desc_addr is valid, i.e. amd_iommu_pi_data.base is fully redundant. Cc: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-23-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20iommu/amd: KVM: SVM: Delete now-unused cached/previous GA tag fieldsSean Christopherson
Delete the amd_ir_data.prev_ga_tag field now that all usage is superfluous. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-13iommu/tegra: Fix incorrect size calculationJason Gunthorpe
This driver uses a mixture of ways to get the size of a PTE, tegra_smmu_set_pde() did it as sizeof(*pd) which became wrong when pd switched to a struct tegra_pd. Switch pd back to a u32* in tegra_smmu_set_pde() so the sizeof(*pd) returns 4. Fixes: 50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/62e7f7fe-6200-4e4f-ad42-d58ad272baa6@tecnico.ulisboa.pt/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-08treewide, timers: Rename from_timer() to timer_container_of()Ingo Molnar
Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-04Merge tag 'pci-v6.16-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Print the actual delay time in pci_bridge_wait_for_secondary_bus() instead of assuming it was 1000ms (Wilfred Mallawa) - Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices', which broke resume from system sleep on AMD platforms and has been fixed by other commits (Lukas Wunner) Resource management: - Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated and unnecessary (Philipp Stanner) - Remove pcim_iounmap_regions() and pcim_request_region_exclusive() and related flags since all uses have been removed (Philipp Stanner) - Rework devres 'request' functions so they are no longer 'hybrid', i.e., their behavior no longer depends on whether pcim_enable_device or pci_enable_device() was used, and remove related code (Philipp Stanner) - Warn (not BUG()) about failure to assign optional resources (Ilpo Järvinen) Error handling: - Log the DPC Error Source ID only when it's actually valid (when ERR_FATAL or ERR_NONFATAL was received from a downstream device) and decode into bus/device/function (Bjorn Helgaas) - Determine AER log level once and save it so all related messages use the same level (Karolina Stolarek) - Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable Errors (Karolina Stolarek) - Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs controls on interval and burst count, to avoid flooding logs and RCU stall warnings (Jon Pan-Doh) Power management: - Increment PM usage counter when probing reset methods so we don't try to read config space of a powered-off device (Alex Williamson) - Set all devices to D0 during enumeration to ensure ACPI opregion is connected via _REG (Mario Limonciello) Power control: - Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match the filename paths. Retain old deprecated symbols for compatibility, except for the pwrctrl slot driver (PCI_PWRCTRL_SLOT) (Johan Hovold) - When unregistering pwrctrl, cancel outstanding rescan work before cleaning up data structures to avoid use-after-free issues (Brian Norris) Bandwidth control: - Simplify link bandwidth controller by replacing the count of Link Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag (Ilpo Järvinen) - Update the Link Speed after retraining, since the Link Speed may have changed (Ilpo Järvinen) PCIe native device hotplug: - Ignore Presence Detect Changed caused by DPC. pciehp already ignores Link Down/Up events caused by DPC, but on slots using in-band presence detect, DPC causes a spurious Presence Detect Changed event (Lukas Wunner) - Ignore Link Down/Up caused by Secondary Bus Reset. On hotplug ports using in-band presence detect, the reset causes a Presence Detect Changed event, which mistakenly caused teardown and re-enumeration of the device. Drivers may need to annotate code that resets their device (Lukas Wunner) Virtualization: - Add an ACS quirk for Loongson Root Ports that don't advertise ACS but don't allow peer-to-peer transactions between Root Ports; the quirk allows each Root Port to be in a separate IOMMU group (Huacai Chen) Endpoint framework: - For fixed-size BARs, retain both the actual size and the possibly larger size allocated to accommodate iATU alignment requirements (Jerome Brunet) - Simplify ctrl/SPAD space allocation and avoid allocating more space than needed (Jerome Brunet) - Correct MSI-X PBA offset calculations for DesignWare and Cadence endpoint controllers (Niklas Cassel) - Align the return value (number of interrupts) encoding for pci_epc_get_msi()/pci_epc_ops::get_msi() and pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel) - Align the nr_irqs parameter encoding for pci_epc_set_msi()/pci_epc_ops::set_msi() and pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel) Common host controller library: - Convert pci-host-common to a library so platforms that don't need native host controller drivers don't need to include these helper functions (Manivannan Sadhasivam) Apple PCIe controller driver: - Extract ECAM bridge creation helper from pci_host_common_probe() to separate driver-specific things like MSI from PCI things (Marc Zyngier) - Dynamically allocate RID-to_SID bitmap to prepare for SoCs with varying capabilities (Marc Zyngier) - Skip ports disabled in DT when setting up ports (Janne Grunau) - Add t6020 compatible string (Alyssa Rosenzweig) - Add T602x PCIe support (Hector Martin) - Directly set/clear INTx mask bits because T602x dropped the accessors that could do this without locking (Marc Zyngier) - Move port PHY registers to their own reg items to accommodate T602x, which moves them around; retain default offsets for existing DTs that lack phy%d entries with the reg offsets (Hector Martin) - Stop polling for core refclk, which doesn't work on T602x and the bootloader has already done anyway (Hector Martin) - Use gpiod_set_value_cansleep() when asserting PERST# in probe because we're allowed to sleep there (Hector Martin) Cadence PCIe controller driver: - Drop a runtime PM 'put' to resolve a runtime atomic count underflow (Hans Zhang) - Make the cadence core buildable as a module (Kishon Vijay Abraham I) - Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by loadable drivers when they are removed (Siddharth Vadapalli) Freescale i.MX6 PCIe controller driver: - Apply link training workaround only on IMX6Q, IMX6SX, IMX6SP (Richard Zhu) - Remove redundant dw_pcie_wait_for_link() from imx_pcie_start_link(); since the DWC core does this, imx6 only needs it when retraining for a faster link speed (Richard Zhu) - Toggle i.MX95 core reset to align with PHY powerup (Richard Zhu) - Set SYS_AUX_PWR_DET to work around i.MX95 ERR051624 erratum: in some cases, the controller can't exit 'L23 Ready' through Beacon or PERST# deassertion (Richard Zhu) - Clear GEN3_ZRXDC_NONCOMPL to work around i.MX95 ERR051586 erratum: controller can't meet 2.5 GT/s ZRX-DC timing when operating at 8 GT/s, causing timeouts in L1 (Richard Zhu) - Wait for i.MX95 PLL lock before enabling controller (Richard Zhu) - Save/restore i.MX95 LUT for suspend/resume (Richard Zhu) Mobiveil PCIe controller driver: - Return bool (not int) for link-up check in mobiveil_pab_ops.link_up() and layerscape-gen4, mobiveil (Hans Zhang) NVIDIA Tegra194 PCIe controller driver: - Create debugfs directory for 'aspm_state_cnt' only when CONFIG_PCIEASPM is enabled, since there are no other entries (Hans Zhang) Qualcomm PCIe controller driver: - Add OF support for parsing DT 'eq-presets-<N>gts' property for lane equalization presets (Krishna Chaitanya Chundru) - Read Maximum Link Width from the Link Capabilities register if DT lacks 'num-lanes' property (Krishna Chaitanya Chundru) - Add Physical Layer 64 GT/s Capability ID and register offsets for 8, 32, and 64 GT/s lane equalization registers (Krishna Chaitanya Chundru) - Add generic dwc support for configuring lane equalization presets (Krishna Chaitanya Chundru) - Add DT and driver support for PCIe on IPQ5018 SoC (Nitheesh Sekar) Renesas R-Car PCIe controller driver: - Describe endpoint BAR 4 as being fixed size (Jerome Brunet) - Document how to obtain R-Car V4H (r8a779g0) controller firmware (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Reorder rockchip_pci_core_rsts because reset_control_bulk_deassert() deasserts in reverse order, to fix a link training regression (Jensen Huang) - Mark RK3399 as being capable of raising INTx interrupts (Niklas Cassel) Rockchip DesignWare PCIe controller driver: - Check only PCIE_LINKUP, not LTSSM status, to determine whether the link is up (Shawn Lin) - Increase N_FTS (used in L0s->L0 transitions) and enable ASPM L0s for Root Complex and Endpoint modes (Shawn Lin) - Hide the broken ATS Capability in rockchip_pcie_ep_init() instead of rockchip_pcie_ep_pre_init() so it stays hidden after PERST# resets non-sticky registers (Shawn Lin) - Call phy_power_off() before phy_exit() in rockchip_pcie_phy_deinit() (Diederik de Haas) Synopsys DesignWare PCIe controller driver: - Set PORT_LOGIC_LINK_WIDTH to one lane to make initial link training more robust; this will not affect the intended link width if all lanes are functional (Wenbin Yao) - Return bool (not int) for link-up check in dw_pcie_ops.link_up() and armada8k, dra7xx, dw-rockchip, exynos, histb, keembay, keystone, kirin, meson, qcom, qcom-ep, rcar_gen4, spear13xx, tegra194, uniphier, visconti (Hans Zhang) - Add debugfs support for exposing DWC device-specific PTM context (Manivannan Sadhasivam) TI J721E PCIe driver: - Make j721e buildable as a loadable and removable module (Siddharth Vadapalli) - Fix j721e host/endpoint dependencies that result in link failures in some configs (Arnd Bergmann) Device tree bindings: - Add qcom DT binding for 'global' interrupt (PCIe controller and link-specific events) for ipq8074, ipq8074-gen3, ipq6018, sa8775p, sc7280, sc8180x sdm845, sm8150, sm8250, sm8350 (Manivannan Sadhasivam) - Add qcom DT binding for 8 MSI SPI interrupts for msm8998, ipq8074, ipq8074-gen3, ipq6018 (Manivannan Sadhasivam) - Add dw rockchip DT binding for rk3576 and rk3562 (Kever Yang) - Correct indentation and style of examples in brcm,stb-pcie, cdns,cdns-pcie-ep, intel,keembay-pcie-ep, intel,keembay-pcie, microchip,pcie-host, rcar-pci-ep, rcar-pci-host, xilinx-versal-cpm (Krzysztof Kozlowski) - Convert Marvell EBU (dove, kirkwood, armada-370, armada-xp) and armada8k from text to schema DT bindings (Rob Herring) - Remove obsolete .txt DT bindings for content that has been moved to schemas (Rob Herring) - Add qcom DT binding for MHI registers in IPQ5332, IPQ6018, IPQ8074 and IPQ9574 (Varadarajan Narayanan) - Convert v3,v360epc-pci from text to DT schema binding (Rob Herring) - Change microchip,pcie-host DT binding to be 'dma-noncoherent' since PolarFire may be configured that way (Conor Dooley) Miscellaneous: - Drop 'pci' suffix from intel_mid_pci.c filename to match similar files (Andy Shevchenko) - All platforms with PCI have an MMU, so add PCI Kconfig dependency on MMU to simplify build testing and avoid inadvertent build regressions (Arnd Bergmann) - Update Krzysztof Wilczyński's email address in MAINTAINERS (Krzysztof Wilczyński) - Update Manivannan Sadhasivam's email address in MAINTAINERS (Manivannan Sadhasivam)" * tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (147 commits) MAINTAINERS: Update Manivannan Sadhasivam email address PCI: j721e: Fix host/endpoint dependencies PCI: j721e: Add support to build as a loadable module PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup PCI: cadence: Add support to build pcie-cadence library as a kernel module MAINTAINERS: Update Krzysztof Wilczyński email address PCI: Remove unnecessary linesplit in __pci_setup_bridge() PCI: WARN (not BUG()) when we fail to assign optional resources PCI: Remove unused pci_printk() PCI: qcom: Replace PERST# sleep time with proper macro PCI: dw-rockchip: Replace PERST# sleep time with proper macro PCI: host-common: Convert to library for host controller drivers PCI/ERR: Remove misleading TODO regarding kernel panic PCI: cadence: Remove duplicate message code definitions PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding PCI: cadence-ep: Correct PBA offset in .set_msix() callback ...
2025-05-31Revert "iommu: make inclusion of arm/arm-smmu-v3 directory conditional"Linus Torvalds
This reverts commit e436576b0231542f6f233279f0972989232575a8. That commit is very broken, and seems to have missed the fact that CONFIG_ARM_SMMU_V3 is not just a yes-or-no thing, but also can be modular. So it caused build errors on arm64 allmodconfig setups: ERROR: modpost: "arm_smmu_make_cdtable_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s2_domain_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ERROR: modpost: "arm_smmu_make_s1_cd" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined! ... (and six more symbols just the same). Link: https://lore.kernel.org/all/CAHk-=wh4qRwm7AQ8sBmQj7qECzgAhj4r73RtCDfmHo5SdcN0Jw@mail.gmail.com/ Cc: Joerg Roedel <joro@8bytes.org> Cc: Rolf Eike Beer <eb@emlix.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-30Merge tag 'iommu-updates-v6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core: - Introduction of iommu-pages infrastructure to consolitate page-table allocation code among hardware drivers. This is ground-work for more generalization in the future - Remove IOMMU_DEV_FEAT_SVA and IOMMU_DEV_FEAT_IOPF feature flags - Convert virtio-iommu to domain_alloc_paging() - KConfig cleanups - Some small fixes for possible overflows and race conditions Intel VT-d driver: - Restore WO permissions on second-level paging entries - Use ida to manage domain id - Miscellaneous cleanups AMD-Vi: - Make sure notifiers finish running before module unload - Add support for HTRangeIgnore feature - Allow matching ACPI HID devices without matching UIDs ARM-SMMU: - SMMUv2: - Recognise the compatible string for SAR2130P MDSS in the Qualcomm driver, as this device requires an identity domain - Fix Adreno stall handling so that GPU debugging is more robust and doesn't e.g. result in deadlock - SMMUv3: - Fix ->attach_dev() error reporting for unrecognised domains - IO-pgtable: - Allow clients (notably, drivers that process requests from userspace) to silence warnings when mapping an already-mapped IOVA S390: - Add support for additional table regions Mediatek: - Add support for MT6893 MM IOMMU And some smaller fixes and improvements in various other drivers" * tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (75 commits) iommu/vt-d: Restore context entry setup order for aliased devices iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mm iommu/arm-smmu-qcom: Make set_stall work when the device is on iommu/arm-smmu: Move handing of RESUME to the context fault handler iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500 iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() iommu: Clear the freelist after iommu_put_pages_list() iommu/vt-d: Change dmar_ats_supported() to return boolean iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit() iommu/vt-d: Replace spin_lock with mutex to protect domain ida iommu/vt-d: Use ida to manage domain id iommu/vt-d: Restore WO permissions on second-level paging entries iommu/amd: Allow matching ACPI HID devices without matching UIDs iommu: make inclusion of arm/arm-smmu-v3 directory conditional iommu: make inclusion of riscv directory conditional iommu: make inclusion of amd directory conditional iommu: make inclusion of intel directory conditional iommu: remove duplicate selection of DMAR_TABLE iommu/fsl_pamu: remove trailing space after \n iommu/arm-smmu-qcom: Add SAR2130P MDSS compatible ...
2025-05-27Merge tag 'dma-mapping-6.16-2025-05-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: "New two step DMA mapping API, which is is a first step to a long path to provide alternatives to scatterlist and to remove hacks, abuses and design mistakes related to scatterlists. This new approach optimizes some calls to DMA-IOMMU layer and cache maintenance by batching them, reduces memory usage as it is no need to store mapped DMA addresses to unmap them, and reduces some function call overhead. It is a combination effort of many people, lead and developed by Christoph Hellwig and Leon Romanovsky" * tag 'dma-mapping-6.16-2025-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: docs: core-api: document the IOVA-based API dma-mapping: add a dma_need_unmap helper dma-mapping: Implement link/unlink ranges API iommu/dma: Factor out a iommu_dma_map_swiotlb helper dma-mapping: Provide an interface to allow allocate IOVA iommu: add kernel-doc for iommu_unmap_fast iommu: generalize the batched sync after map interface dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h PCI/P2PDMA: Refactor the p2pdma mapping helpers
2025-05-23Merge branches 'fixes', 'apple/dart', 'arm/smmu/updates', ↵Joerg Roedel
'arm/smmu/bindings', 'fsl/pamu', 'mediatek', 'renesas/ipmmu', 's390', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
2025-05-23iommu/vt-d: Restore context entry setup order for aliased devicesLu Baolu
Commit 2031c469f816 ("iommu/vt-d: Add support for static identity domain") changed the context entry setup during domain attachment from a set-and-check policy to a clear-and-reset approach. This inadvertently introduced a regression affecting PCI aliased devices behind PCIe-to-PCI bridges. Specifically, keyboard and touchpad stopped working on several Apple Macbooks with below messages: kernel: platform pxa2xx-spi.3: Adding to iommu group 20 kernel: input: Apple SPI Keyboard as /devices/pci0000:00/0000:00:1e.3/pxa2xx-spi.3/spi_master/spi2/spi-APP000D:00/input/input0 kernel: DMAR: DRHD: handling fault status reg 3 kernel: DMAR: [DMA Read NO_PASID] Request device [00:1e.3] fault addr 0xffffa000 [fault reason 0x06] PTE Read access is not set kernel: DMAR: DRHD: handling fault status reg 3 kernel: DMAR: [DMA Read NO_PASID] Request device [00:1e.3] fault addr 0xffffa000 [fault reason 0x06] PTE Read access is not set kernel: applespi spi-APP000D:00: Error writing to device: 01 0e 00 00 kernel: DMAR: DRHD: handling fault status reg 3 kernel: DMAR: [DMA Read NO_PASID] Request device [00:1e.3] fault addr 0xffffa000 [fault reason 0x06] PTE Read access is not set kernel: DMAR: DRHD: handling fault status reg 3 kernel: applespi spi-APP000D:00: Error writing to device: 01 0e 00 00 Fix this by restoring the previous context setup order. Fixes: 2031c469f816 ("iommu/vt-d: Add support for static identity domain") Closes: https://lore.kernel.org/all/4dada48a-c5dd-4c30-9c85-5b03b0aa01f0@bfh.ch/ Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20250514060523.2862195-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20250520075849.755012-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-22iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mmAngeloGioacchino Del Regno
Fix the "mediatek.mt6893-iommu-mm" compatible string typo, as the dot was actually meant to be a comma: "mediatek,mt6893-iommu-mm". Fixes: f6a1e89ab6e3 ("iommu/mediatek: Add support for Dimensity 1200 MT6893 MM IOMMU") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250521151548.185910-1-angelogioacchino.delregno@collabora.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-22iommu: Skip PASID validation for devices without PASID capabilityTushar Dave
Generally PASID support requires ACS settings that usually create single device groups, but there are some niche cases where we can get multi-device groups and still have working PASID support. The primary issue is that PCI switches are not required to treat PASID tagged TLPs specially so appropriate ACS settings are required to route all TLPs to the host bridge if PASID is going to work properly. pci_enable_pasid() does check that each device that will use PASID has the proper ACS settings to achieve this routing. However, no-PASID devices can be combined with PASID capable devices within the same topology using non-uniform ACS settings. In this case the no-PASID devices may not have strict route to host ACS flags and end up being grouped with the PASID devices. This configuration fails to allow use of the PASID within the iommu core code which wrongly checks if the no-PASID device supports PASID. Fix this by ignoring no-PASID devices during the PASID validation. They will never issue a PASID TLP anyhow so they can be ignored. Fixes: c404f55c26fc ("iommu: Validate the PASID in iommu_attach_device_pasid()") Cc: stable@vger.kernel.org Signed-off-by: Tushar Dave <tdave@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20250520011937.3230557-1-tdave@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-21iommu/arm-smmu-qcom: Make set_stall work when the device is onConnor Abbott
Up until now we have only called the set_stall callback during initialization when the device is off. But we will soon start calling it to temporarily disable stall-on-fault when the device is on, so handle that by checking if the device is on and writing SCTLR. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-3-fce6ee218787@gmail.com [will: Fix "mixed declarations and code" warning from sparse] Signed-off-by: Will Deacon <will@kernel.org>
2025-05-21iommu/arm-smmu: Move handing of RESUME to the context fault handlerConnor Abbott
The upper layer fault handler is now expected to handle everything required to retry the transaction or dump state related to it, since we enable threaded IRQs. This means that we can take charge of writing RESUME, making sure that we always write it after writing FSR as recommended by the specification. The iommu handler should write -EAGAIN if a transaction needs to be retried. This avoids tricky cross-tree changes in drm/msm, since it never wants to retry the transaction and it already returns 0 from its fault handler. Therefore it will continue to correctly terminate the transaction without any changes required. devcoredumps from drm/msm will temporarily be broken until it is fixed to collect devcoredumps inside its fault handler, but fixing that first would actually be worse because MMU-500 ignores writes to RESUME unless all fields of FSR (except SS of course) are clear and raises an interrupt when only SS is asserted. Right now, things happen to work most of the time if we collect a devcoredump, because RESUME is written asynchronously in the fault worker after the fault handler clears FSR and finishes, although there will be some spurious faults, but if this is changed before this commit fixes the FSR/RESUME write order then SS will never be cleared, the interrupt will never be cleared, and the whole system will hang every time a fault happens. It will therefore help bisectability if this commit goes first. I've changed the TBU path to also accept -EAGAIN and do the same thing, while keeping the old -EBUSY behavior. Although the old path was broken because you'd get a storm of interrupts due to returning IRQ_NONE that would eventually result in the interrupt being disabled, and I think it was dead code anyway, so it should eventually be deleted. Note that drm/msm never uses TBU so this is untested. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-2-fce6ee218787@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-21iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500Connor Abbott
The recommended flow for stall-on-fault in SMMUv2 is the following: 1. Resolve the fault. 2. Write to FSR to clear the fault bits. 3. Write RESUME to retry or fail the transaction. MMU500 is designed with this sequence in mind. For example, experimentally we have seen on MMU500 that writing RESUME does not clear FSR.SS unless the original fault is cleared in FSR, so 2 must come before 3. FSR.SS is allowed to signal a fault (and does on MMU500) so that if we try to do 2 -> 1 -> 3 (while exiting from the fault handler after 2) we can get duplicate faults without hacks to disable interrupts. However, resolving the fault typically requires lengthy operations that can stall, like bringing in pages from disk. The only current user, drm/msm, dumps GPU state before failing the transaction which indeed can stall. Therefore, from now on we will require implementations that want to use stall-on-fault to also enable threaded IRQs. Do that with the Adreno MMU implementations. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Link: https://lore.kernel.org/r/20250520-msm-gpu-fault-fixes-next-v8-1-fce6ee218787@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-20iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()Rob Clark
In situations where mapping/unmapping sequence can be controlled by userspace, attempting to map over a region that has not yet been unmapped is an error. But not something that should spam dmesg. Now that there is a quirk, we can also drop the selftest_running flag, and use the quirk instead for selftests. Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20250519175348.11924-6-robdclark@gmail.com [will: Rename quirk to IO_PGTABLE_QUIRK_NO_WARN per Robin's suggestion] Signed-off-by: Will Deacon <will@kernel.org>
2025-05-16iommu: Clear the freelist after iommu_put_pages_list()Jason Gunthorpe
The commit below reworked how iommu_put_pages_list() worked to not do list_del() on every entry. This was done expecting all the callers to already re-init the list so doing a per-item deletion is not efficient. It was missed that fq_ring_free_locked() re-uses its list after calling iommu_put_pages_list() and so the leftover list reaches free'd struct pages and will crash or WARN/BUG/etc. Reinit the list to empty in fq_ring_free_locked() after calling iommu_put_pages_list(). Audit to see if any other callers of iommu_put_pages_list() need the list to be empty: - iommu_dma_free_fq_single() and iommu_dma_free_fq_percpu() immediately frees the memory - iommu_v1_map_pages(), v1_free_pgtable(), domain_exit(), riscv_iommu_map_pages() uses a stack variable which goes out of scope - intel_iommu_tlb_sync() uses a gather in a iotlb_sync() callback, the caller re-inits the gather Fixes: 13f43d7cf3e0 ("iommu/pages: Formalize the freelist API") Reported-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com> Closes: https://lore.kernel.org/r/SJ1PR11MB61292CE72D7BE06B8810021CB997A@SJ1PR11MB6129.namprd11.prod.outlook.com Tested-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-7d4dfa6140f7+11f04-iommu_freelist_init_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/vt-d: Change dmar_ats_supported() to return booleanWei Wang
According to "Function return values and names" in coding-style.rst, the dmar_ats_supported() function should return a boolean instead of an integer. Also, rename "ret" to "supported" to be more straightforward. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20250509140021.4029303-3-wei.w.wang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit()Wei Wang
The function dmar_find_matched_satc_unit() contains a duplicate call to pci_physfn(). This call is unnecessary as pci_physfn() has already been invoked by the caller. Removing the redundant call simplifies the code and improves efficiency a bit. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Link: https://lore.kernel.org/r/20250509140021.4029303-2-wei.w.wang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/vt-d: Replace spin_lock with mutex to protect domain idaLu Baolu
The domain ID allocator is currently protected by a spin_lock. However, ida_alloc_range can potentially block if it needs to allocate memory to grow its internal structures. Replace the spin_lock with a mutex which allows sleep on block. Thus, the memory allocation flags can be updated from GFP_ATOMIC to GFP_KERNEL to allow blocking memory allocations if necessary. Introduce a new mutex, did_lock, specifically for protecting the domain ida. The existing spinlock will remain for protecting other intel_iommu fields. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250430021135.2370244-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/vt-d: Use ida to manage domain idLu Baolu
Switch the intel iommu driver to use the ida mechanism for managing domain IDs, replacing the previous fixed-size bitmap. The previous approach allocated a bitmap large enough to cover the maximum number of domain IDs supported by the hardware, regardless of the actual number of domains in use. This led to unnecessary memory consumption, especially on systems supporting a large number of iommu units but only utilizing a small number of domain IDs. The ida allocator dynamically manages the allocation and freeing of integer IDs, only consuming memory for the IDs that are currently in use. This significantly optimizes memory usage compared to the fixed-size bitmap. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250430021135.2370244-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/vt-d: Restore WO permissions on second-level paging entriesJason Gunthorpe
VT-D HW can do WO permissions on the second-stage but not the first-stage page table formats. The commit eea53c581688 ("iommu/vt-d: Remove WO permissions on second-level paging entries") wanted to make this uniform for VT-D by disabling the support for WO permissions in the second-stage. This isn't consistent with how other drivers are working. Instead if the underlying HW can support WO, it should. For instance AMD already supports WO on its second stage (v1) format and not its first (v2). If WO support needs to be discoverable it should be done through an iommu_domain capability flag. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v1-c26553717e90+65f-iommu_vtd_ss_wo_jgg@nvidia.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/amd: Allow matching ACPI HID devices without matching UIDsMario Limonciello
A BIOS upgrade has changed the IVRS DTE UID for a device that no longer matches the UID in the SSDT. In this case there is only one ACPI device on the system with that _HID but the _UID mismatch. IVRS: ``` Subtable Type : F0 [Device Entry: ACPI HID Named Device] Device ID : 0060 Data Setting (decoded below) : 40 INITPass : 0 EIntPass : 0 NMIPass : 0 Reserved : 0 System MGMT : 0 LINT0 Pass : 1 LINT1 Pass : 0 ACPI HID : "MSFT0201" ACPI CID : 0000000000000000 UID Format : 02 UID Length : 09 UID : "\_SB.MHSP" ``` SSDT: ``` Device (MHSP) { Name (_ADR, Zero) // _ADR: Address Name (_HID, "MSFT0201") // _HID: Hardware ID Name (_UID, One) // _UID: Unique ID ``` To handle this case; while enumerating ACPI devices in get_acpihid_device_id() count the number of matching ACPI devices with a matching _HID. If there is exactly one _HID match then accept it even if the UID doesn't match. Other operating systems allow this, but the current IVRS spec leaves some ambiguity whether to allow or disallow it. This should be clarified in future revisions of the spec. Output 'Firmware Bug' for this case to encourage it to be solved in the BIOS. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20250512173129.1274275-1-superm1@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu: make inclusion of arm/arm-smmu-v3 directory conditionalRolf Eike Beer
Nothing in there is active if CONFIG_ARM_SMMU_V3 is not enabled, so the whole directory can depend on that switch as well. Fixes: e86d1aa8b60f ("iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2434059.NG923GbCHz@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu: make inclusion of riscv directory conditionalRolf Eike Beer
Nothing in there is active if CONFIG_RISCV_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: 5c0ebbd3c6c6 ("iommu/riscv: Add RISC-V IOMMU platform device driver") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2235451.Icojqenx9y@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu: make inclusion of amd directory conditionalRolf Eike Beer
Nothing in there is active if CONFIG_AMD_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: cbe94c6e1a7d ("iommu/amd: Move Kconfig and Makefile bits down into amd directory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1894970.atdPhlSkOF@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu: make inclusion of intel directory conditionalRolf Eike Beer
Nothing in there is active if CONFIG_INTEL_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: ab65ba57e3ac ("iommu/vt-d: Move Kconfig and Makefile bits down into intel directory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/3818749.MHq7AAxBmi@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu: remove duplicate selection of DMAR_TABLERolf Eike Beer
This is already done in intel/Kconfig. Fixes: 70bad345e622 ("iommu: Fix compilation without CONFIG_IOMMU_INTEL") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2232605.Mh6RI2rZIc@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-16iommu/fsl_pamu: remove trailing space after \nColin Ian King
There is an extraenous space after \n in a pr_debug message. Remove it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250430151853.923614-1-colin.i.king@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-05-15Revert "iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices"Lukas Wunner
Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") changed IRQ handling on PCI driver probing. It inadvertently broke resume from system sleep on AMD platforms: https://lore.kernel.org/r/20150926164651.GA3640@pd.tnic/ This was fixed by two independent commits: * 8affb487d4a4 ("x86/PCI: Don't alloc pcibios-irq when MSI is enabled") * cbbc00be2ce3 ("iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices") The breaking change and one of these two fixes were subsequently reverted: * fe25d078874f ("Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"") * 6c777e8799a9 ("Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"") This rendered the second fix unnecessary, so revert it as well. It used the match_driver flag in struct pci_dev, which is internal to the PCI core and not supposed to be touched by arbitrary drivers. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://patch.msgid.link/9a3ddff5cc49512044f963ba0904347bd404094d.1745572340.git.lukas@wunner.de
2025-05-06iommu/arm-smmu-qcom: Add SAR2130P MDSS compatibleDmitry Baryshkov
Add the SAR2130P compatible to clients compatible list, the device require identity domain. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250418-sar2130p-display-v5-9-442c905cb3a4@oss.qualcomm.com Signed-off-by: Will Deacon <will@kernel.org>