summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)Author
2023-04-12PCI: qcom: Add support for system suspend and resumeManivannan Sadhasivam
During the system suspend, vote for minimal interconnect bandwidth (1KiB) to keep the interconnect path active for config access and also turn OFF the resources like clock and PHY if there are no active devices connected to the controller. For the controllers with active devices, the resources are kept ON as removing the resources will trigger access violation during the late end of suspend cycle as kernel tries to access the config space of PCIe devices to mask the MSIs. Also, it is not desirable to put the link into L2/L3 state as that implies VDD supply will be removed and the devices may go into powerdown state. This will affect the lifetime of storage devices like NVMe. And finally, during resume, turn ON the resources if the controller was truly suspended (resources OFF) and update the interconnect bandwidth based on PCIe Gen speed. Suggested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Link: https://lore.kernel.org/r/20230403154922.20704-2-manivannan.sadhasivam@linaro.org Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Dhruva Gole <d-gole@ti.com>
2023-04-11PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameterMika Westerberg
All callers of pci_bridge_wait_for_secondary_bus() supply a timeout of PCIE_RESET_READY_POLL_MS, so drop the parameter. Move the definition of PCIE_RESET_READY_POLL_MS into pci.c, the only user. [bhelgaas: extracted from https://lore.kernel.org/r/20230404052714.51315-3-mika.westerberg@linux.intel.com] Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-04-11PCI/PM: Increase wait time after resumeMika Westerberg
PCIe r6.0 sec 6.6.1 prescribes that a device must be able to respond to config requests within 1.0 s (PCI_RESET_WAIT) after exiting conventional reset and this same delay is prescribed when coming out of D3cold (as that involves reset too). A device that requires more than 1 second to initialize after reset may respond to config requests with Request Retry Status completions (sec 2.3.1), and we accommodate that in Linux with a 60 second cap (PCIE_RESET_READY_POLL_MS). Previously we waited up to PCIE_RESET_READY_POLL_MS only in the reset code path, not in the resume path. However, a device has surfaced, namely Intel Titan Ridge xHCI, which requires a longer delay also in the resume code path. Make the resume code path to use this same extended delay as the reset path. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216728 Link: https://lore.kernel.org/r/20230404052714.51315-2-mika.westerberg@linux.intel.com Reported-by: Chris Chiu <chris.chiu@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de>
2023-04-11Merge tag 'pci-v6.3-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Provide pci_msix_can_alloc_dyn() stub when CONFIG_PCI_MSI unset to avoid build errors (Reinette Chatre) - Quirk AMD XHCI controller that loses MSI-X state in D3hot to avoid broken USB after hotplug or suspend/resume (Basavaraj Natikar) - Fix use-after-free in pci_bus_release_domain_nr() (Rob Herring) * tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Fix use-after-free in pci_bus_release_domain_nr() x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot PCI/MSI: Provide missing stub for pci_msix_can_alloc_dyn()
2023-04-11PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lockLukas Wunner
In 2013, commits 2e35afaefe64 ("PCI: pciehp: Add reset_slot() method") 608c388122c7 ("PCI: Add slot reset option to pci_dev_reset()") amended PCIe hotplug to mask Presence Detect Changed events during a Secondary Bus Reset. The reset thus no longer causes gratuitous slot bringdown and bringup. However the commits neglected to serialize reset with code paths reading slot registers. For instance, a slot bringup due to an earlier hotplug event may see the Presence Detect State bit cleared during a concurrent Secondary Bus Reset. In 2018, commit 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset") retrofitted the missing locking. It introduced a reset_lock which serializes a Secondary Bus Reset with other parts of pciehp. Unfortunately the locking turns out to be overzealous: reset_lock is held for the entire enumeration and de-enumeration of hotplugged devices, including driver binding and unbinding. Driver binding and unbinding acquires device_lock while the reset_lock of the ancestral hotplug port is held. A concurrent Secondary Bus Reset acquires the ancestral reset_lock while already holding the device_lock. The asymmetric locking order in the two code paths can lead to AB-BA deadlocks. Michael Haeuptle reports such deadlocks on simultaneous hot-removal and vfio release (the latter implies a Secondary Bus Reset): pciehp_ist() # down_read(reset_lock) pciehp_handle_presence_or_link_change() pciehp_disable_slot() __pciehp_disable_slot() remove_board() pciehp_unconfigure_device() pci_stop_and_remove_bus_device() pci_stop_bus_device() pci_stop_dev() device_release_driver() device_release_driver_internal() __device_driver_lock() # device_lock() SYS_munmap() vfio_device_fops_release() vfio_device_group_close() vfio_device_close() vfio_device_last_close() vfio_pci_core_close_device() vfio_pci_core_disable() # device_lock() __pci_reset_function_locked() pci_reset_bus_function() pci_dev_reset_slot_function() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Ian May reports the same deadlock on simultaneous hot-removal and an AER-induced Secondary Bus Reset: aer_recover_work_func() pcie_do_recovery() aer_root_reset() pci_bus_error_reset() pci_slot_reset() pci_slot_lock() # device_lock() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Fix by releasing the reset_lock during driver binding and unbinding, thereby splitting and shrinking the critical section. Driver binding and unbinding is protected by the device_lock() and thus serialized with a Secondary Bus Reset. There's no need to additionally protect it with the reset_lock. However, pciehp does not bind and unbind devices directly, but rather invokes PCI core functions which also perform certain enumeration and de-enumeration steps. The reset_lock's purpose is to protect slot registers, not enumeration and de-enumeration of hotplugged devices. That would arguably be the job of the PCI core, not the PCIe hotplug driver. After all, an AER-induced Secondary Bus Reset may as well happen during boot-time enumeration of the PCI hierarchy and there's no locking to prevent that either. Exempting *de-enumeration* from the reset_lock is relatively harmless: A concurrent Secondary Bus Reset may foil config space accesses such as PME interrupt disablement. But if the device is physically gone, those accesses are pointless anyway. If the device is physically present and only logically removed through an Attention Button press or the sysfs "power" attribute, PME interrupts as well as DMA cannot come through because pciehp_unconfigure_device() disables INTx and Bus Master bits. That's still protected by the reset_lock in the present commit. Exempting *enumeration* from the reset_lock also has limited impact: The exempted call to pci_bus_add_device() may perform device accesses through pcibios_bus_add_device() and pci_fixup_device() which are now no longer protected from a concurrent Secondary Bus Reset. Otherwise there should be no impact. In essence, the present commit seeks to fix the AB-BA deadlocks while still retaining a best-effort reset protection for enumeration and de-enumeration of hotplugged devices -- until a general solution is implemented in the PCI core. Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical.com Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.com Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huawei.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset") Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.de Reported-by: Michael Haeuptle <michael.haeuptle@hpe.com> Reported-by: Ian May <ian.may@canonical.com> Reported-by: Andrey Grodzovsky <andrey2805@gmail.com> Reported-by: Rahul Kumar <rahul.kumar1@amd.com> Reported-by: Jialin Zhang <zhangjialin11@huawei.com> Tested-by: Anatoli Antonovitch <Anatoli.Antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.19+ Cc: Dan Stein <dstein@hpe.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Alex Michon <amichon@kalrayinc.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-04-11PCI: qcom: Expose link transition counts via debugfsManivannan Sadhasivam
Qualcomm PCIe controllers have debug registers in the MHI region that count PCIe link transitions. Expose them over debugfs to userspace to help debug the low power issues. Note that even though the registers are prefixed as PARF_, they don't live under the "parf" register region. The register naming is following the Qualcomm's internal documentation as like other registers. While at it, let's arrange the local variables in probe function to follow reverse XMAS tree order. Link: https://lore.kernel.org/r/20230316081117.14288-20-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Rename qcom_pcie_config_sid_sm8250() to reflect IP versionManivannan Sadhasivam
qcom_pcie_config_sid_sm8250() function no longer applies only to SM8250. So let's rename it to reflect the actual IP version and also move its definition to keep it sorted as per IP revisions. Link: https://lore.kernel.org/r/20230316081117.14288-15-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use macros for defining total no. of clocks & suppliesManivannan Sadhasivam
To keep uniformity, let's use macros to define the total number of clocks and supplies in qcom_pcie_resources_{2_7_0/2_9_0} structs. Link: https://lore.kernel.org/r/20230316081117.14288-14-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.4.0Manivannan Sadhasivam
All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. It should be noted that there were delays in-between the reset asserts and deasserts. But going by the config used by other revisions, those delays are not really necessary. So a single delay after all asserts and one after deasserts is used. The total number of resets supported is 12 but only ipq4019 is using all of them. Link: https://lore.kernel.org/r/20230316081117.14288-13-manivannan.sadhasivam@linaro.org Tested-by: Sricharan Ramabadhran <quic_srichara@quicinc.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.3.3Manivannan Sadhasivam
All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-12-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.3Manivannan Sadhasivam
All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-11-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.2Manivannan Sadhasivam
All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-10-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 1.0.0Manivannan Sadhasivam
All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-9-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.1.0Manivannan Sadhasivam
All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. While at it, let's also move the qcom_pcie_resources_2_1_0 struct below qcom_pcie_resources_1_0_0 to keep it sorted. Link: https://lore.kernel.org/r/20230316081117.14288-8-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use lower case for hexManivannan Sadhasivam
To maintain uniformity, let's use lower case for representing hexadecimal numbers. Link: https://lore.kernel.org/r/20230316081117.14288-7-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Add missing macros for register fieldsManivannan Sadhasivam
Some of the registers are changed using hardcoded bitfields without macros. This provides no information on what the register setting is about. So add the macros to those fields for making the code more understandable. Link: https://lore.kernel.org/r/20230316081117.14288-6-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Use bitfield definitions for register fieldsManivannan Sadhasivam
To maintain uniformity throughout the driver and also to make the code easier to read, let's make use of bitfield definitions for register fields. Link: https://lore.kernel.org/r/20230316081117.14288-5-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Sort and group registers and bitfield definitionsManivannan Sadhasivam
Sorting the registers and their bit definitions will make it easier to add more definitions in the future and it also helps in maintenance. While at it, let's also group the registers and bit definitions separately as done in the pcie-qcom-ep driver. Link: https://lore.kernel.org/r/20230316081117.14288-4-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Remove PCIE20_ prefix from register definitionsManivannan Sadhasivam
The PCIE part is redundant and 20 doesn't represent anything across the SoCs supported now. So let's get rid of the prefix. This involves adding the IP version suffix to one definition of PARF_SLV_ADDR_SPACE_SIZE that defines offset specific to that version. The other definition is generic for the rest of the versions. Also, the register PCIE20_LNK_CONTROL2_LINK_STATUS2 is not used anywhere, hence removed. Link: https://lore.kernel.org/r/20230316081117.14288-3-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-04-11PCI: qcom: Fix the incorrect register usage in v2.7.0 configManivannan Sadhasivam
Qcom PCIe IP version v2.7.0 and its derivatives don't contain the PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT register. Instead, they have the new PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2 register. So fix the incorrect register usage which is modifying a different register. Also in this IP version, this register change doesn't depend on MSI being enabled. So remove that check also. Link: https://lore.kernel.org/r/20230316081117.14288-2-manivannan.sadhasivam@linaro.org Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: <stable@vger.kernel.org> # 5.6+
2023-04-09Merge tag 'cxl-fixes-6.3-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link (cxl) fixes from Dan Williams: "Several fixes for driver startup regressions that landed during the merge window as well as some older bugs. The regressions were due to a lack of testing with what the CXL specification calls Restricted CXL Host (RCH) topologies compared to the testing with Virtual Host (VH) CXL topologies. A VH topology is typical PCIe while RCH topologies map CXL endpoints as Root Complex Integrated endpoints. The impact is some driver crashes on startup. This merge window also added compatibility for range registers (the mechanism that CXL 1.1 defined for mapping memory) to treat them like HDM decoders (the mechanism that CXL 2.0 defined for mapping Host-managed Device Memory). That work collided with the new region enumeration code that was tested with CXL 2.0 setups, and fails with crashes at startup. Lastly, the DOE (Data Object Exchange) implementation for retrieving an ACPI-like data table from CXL devices is being reworked for v6.4. Several fixes fell out of that work that are suitable for v6.3. All of this has been in linux-next for a while, and all reported issues [1] have been addressed. Summary: - Fix several issues with region enumeration in RCH topologies that can trigger crashes on driver startup or shutdown. - Fix CXL DVSEC range register compatibility versus region enumeration that leads to startup crashes - Fix CDAT endiannes handling - Fix multiple buffer handling boundary conditions - Fix Data Object Exchange (DOE) workqueue usage vs CONFIG_DEBUG_OBJECTS warn splats" Link: http://lore.kernel.org/r/20230405075704.33de8121@canb.auug.org.au [1] * tag 'cxl-fixes-6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/hdm: Extend DVSEC range register emulation for region enumeration cxl/hdm: Limit emulation to the number of range registers cxl/region: Move coherence tracking into cxl_region_attach() cxl/region: Fix region setup/teardown for RCDs cxl/port: Fix find_cxl_root() for RCDs and simplify it cxl/hdm: Skip emulation when driver manages mem_enable cxl/hdm: Fix double allocation of @cxlhdm PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y cxl/pci: Handle excessive CDAT length cxl/pci: Handle truncated CDAT entries cxl/pci: Handle truncated CDAT header cxl/pci: Fix CDAT retrieval on big endian
2023-04-07PCI/EDR: Add edr_handle_event() commentsBjorn Helgaas
EDR documentation is a bit sketchy. Add a couple comments to edr_handle_event() about the devices involved. Link: https://lore.kernel.org/r/20230407215259.GA3825733@bhelgaas Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-04-07PCI/EDR: Clear Device Status after EDR error recoveryKuppuswamy Sathyanarayanan
During EDR recovery, the OS must clear error status of the port that triggered DPC even if firmware retains control of DPC and AER (see the implementation note in the PCI Firmware spec r3.3, sec 4.6.12). Prior to 068c29a248b6 ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER"), the port Device Status was cleared in this path: edr_handle_event dpc_process_error(dev) # "dev" triggered DPC pcie_do_recovery(dev, dpc_reset_link) dpc_reset_link # exit DPC pcie_clear_device_status(dev) # clear Device Status After 068c29a248b6, pcie_do_recovery() no longer clears Device Status when firmware controls AER, so the error bit remains set even after recovery. Per the "Downstream Port Containment configuration control" bit in the returned _OSC Control Field (sec 4.5.1), the OS is allowed to clear error status until it evaluates _OST, so clear Device Status in edr_handle_event() if the error recovery was successful. [bhelgaas: commit log] Fixes: 068c29a248b6 ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER") Link: https://lore.kernel.org/r/20230315235449.1279209-1-sathyanarayanan.kuppuswamy@linux.intel.com Reported-by: Tsaur Erwin <erwin.tsaur@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-04-06PCI: Fix use-after-free in pci_bus_release_domain_nr()Rob Herring
Commit c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") introduced a use-after-free bug in the bus removal cleanup. The issue was found with kfence: [ 19.293351] BUG: KFENCE: use-after-free read in pci_bus_release_domain_nr+0x10/0x70 [ 19.302817] Use-after-free read at 0x000000007f3b80eb (in kfence-#115): [ 19.309677] pci_bus_release_domain_nr+0x10/0x70 [ 19.309691] dw_pcie_host_deinit+0x28/0x78 [ 19.309702] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.309734] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.309752] platform_probe+0x90/0xd8 ... [ 19.311457] kfence-#115: 0x00000000063a155a-0x00000000ba698da8, size=1072, cache=kmalloc-2k [ 19.311469] allocated by task 96 on cpu 10 at 19.279323s: [ 19.311562] __kmem_cache_alloc_node+0x260/0x278 [ 19.311571] kmalloc_trace+0x24/0x30 [ 19.311580] pci_alloc_bus+0x24/0xa0 [ 19.311590] pci_register_host_bridge+0x48/0x4b8 [ 19.311601] pci_scan_root_bus_bridge+0xc0/0xe8 [ 19.311613] pci_host_probe+0x18/0xc0 [ 19.311623] dw_pcie_host_init+0x2c0/0x568 [ 19.311630] tegra_pcie_dw_probe+0x610/0xb28 [pcie_tegra194] [ 19.311647] platform_probe+0x90/0xd8 ... [ 19.311782] freed by task 96 on cpu 10 at 19.285833s: [ 19.311799] release_pcibus_dev+0x30/0x40 [ 19.311808] device_release+0x30/0x90 [ 19.311814] kobject_put+0xa8/0x120 [ 19.311832] device_unregister+0x20/0x30 [ 19.311839] pci_remove_bus+0x78/0x88 [ 19.311850] pci_remove_root_bus+0x5c/0x98 [ 19.311860] dw_pcie_host_deinit+0x28/0x78 [ 19.311866] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.311883] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.311900] platform_probe+0x90/0xd8 ... [ 19.313579] CPU: 10 PID: 96 Comm: kworker/u24:2 Not tainted 6.2.0 #4 [ 19.320171] Hardware name: /, BIOS 1.0-d7fb19b 08/10/2022 [ 19.325852] Workqueue: events_unbound deferred_probe_work_func The stack trace is a bit misleading as dw_pcie_host_deinit() doesn't directly call pci_bus_release_domain_nr(). The issue turns out to be in pci_remove_root_bus() which first calls pci_remove_bus() which frees the struct pci_bus when its struct device is released. Then pci_bus_release_domain_nr() is called and accesses the freed struct pci_bus. Reordering these fixes the issue. Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") Link: https://lore.kernel.org/r/20230329123835.2724518-1-robh@kernel.org Link: https://lore.kernel.org/r/b529cb69-0602-9eed-fc02-2f068707a006@nvidia.com Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v6.2+ Cc: Pali Rohár <pali@kernel.org>
2023-04-06PCI/P2PDMA: Fix pci_p2pmem_find_many() kernel-docCai Huoqing
Remove reference to pci_p2pmem_dma(), which has never existed. Link: https://lore.kernel.org/r/20230329024731.5604-1-cai.huoqing@linux.dev Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-04-05PCI: Make pci_bus_for_each_resource() index optionalAndy Shevchenko
Refactor pci_bus_for_each_resource() in the same way as pci_dev_for_each_resource(). This allows the index to be hidden inside the implementation so the caller can omit it when it's not used otherwise. No functional changes intended. Link: https://lore.kernel.org/r/20230330162434.35055-6-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-04-04PCI: Introduce pci_dev_for_each_resource()Mika Westerberg
Instead of open-coding it everywhere introduce a tiny helper that can be used to iterate over each resource of a PCI device, and convert the most obvious users into it. While at it drop doubled empty line before pdev_sort_resources(). No functional changes intended. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230330162434.35055-4-andriy.shevchenko@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
2023-04-03PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=yLukas Wunner
After a pci_doe_task completes, its work_struct needs to be destroyed to avoid a memory leak with CONFIG_DEBUG_OBJECTS=y. Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/775768b4912531c3b887d405fc51a50e465e1bf9.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-03PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=yLukas Wunner
Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL probing because pci_doe_submit_task() invokes INIT_WORK() instead of INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack. All callers of pci_doe_submit_task() allocate the work_struct on the stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable short-term fix. The long-term fix implemented by a subsequent commit is to move to a synchronous API which allocates the work_struct internally in the DOE library. Stacktrace for posterity: WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183 CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: pci_doe_submit_task+0x5d/0xd0 pci_doe_discovery+0xb4/0x100 pcim_doe_create_mb+0x219/0x290 cxl_pci_probe+0x192/0x430 local_pci_probe+0x41/0x80 pci_device_probe+0xb3/0x220 really_probe+0xde/0x380 __driver_probe_device+0x78/0x170 driver_probe_device+0x1f/0x90 __driver_attach_async_helper+0x5c/0xe0 async_run_entry_fn+0x30/0x130 process_one_work+0x294/0x5b0 Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions") Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/ Reported-by: Gregory Price <gregory.price@memverge.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/67a9117f463ecdb38a2dbca6a20391ce2f1e7a06.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-03Merge 6.3-rc5 into driver-core-nextGreg Kroah-Hartman
We need the fixes in here for testing, as well as the driver core changes for documentation updates to build on. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-31Merge tag 'pci-v6.3-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fix from Bjorn Helgaas: - Fix DesignWare PORT_LINK_CONTROL setup, which was corrupted when the DT "snps,enable-cdm-check" property was present (Yoshihiro Shimoda) * tag 'pci-v6.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: dwc: Fix PORT_LINK_CONTROL update when CDM check enabled
2023-03-24PCI: ixp4xx: Use PCI_CONF1_ADDRESS() macroPali Rohár
Simplify pci-ixp4xx.c driver code and use new PCI_CONF1_ADDRESS() macro for accessing PCI config space. Link: https://lore.kernel.org/r/20220928122539.15116-1-pali@kernel.org Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-03-24PCI: mt7621: Use dev_info() to log PCIe card detectionSergio Paracuellos
When there is no card plugged on a PCIe port a log reporting that the port will be disabled is flagged as an error (dev_err()). Since this is not an error at all, change the log level by using dev_info() instead. Link: https://lore.kernel.org/r/20230324073733.1596231-1-sergio.paracuellos@gmail.com Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
2023-03-24PCI: imx6: Install the fault handler only on compatible matchH. Nikolaus Schaller
commit bb38919ec56e ("PCI: imx6: Add support for i.MX6 PCIe controller") added a fault hook to this driver in the probe function. So it was only installed if needed. commit bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO") moved it from probe to driver init which installs the hook unconditionally as soon as the driver is compiled into a kernel. When this driver is compiled as a module, the hook is not registered until after the driver has been matched with a .compatible and loaded. commit 415b6185c541 ("PCI: imx6: Fix config read timeout handling") extended the fault handling code. commit 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") added some protection for non-ARM architectures, but this does not protect non-i.MX ARM architectures. Since fault handlers can be triggered on any architecture for different reasons, there is no guarantee that they will be triggered only for the assumed situation, leading to improper error handling (i.MX6-specific imx6q_pcie_abort_handler) on foreign systems. I had seen strange L3 imprecise external abort messages several times on OMAP4 and OMAP5 devices and couldn't make sense of them until I realized they were related to this unused imx6q driver because I had CONFIG_PCI_IMX6=y. Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed to run on different ARM SoC and be differentiated only by device tree binaries. So turning off CONFIG_PCI_IMX6 is not a solution. Therefore we check the compatible in the init function before registering the fault handler. Link: https://lore.kernel.org/r/e1bcfc3078c82b53aa9b78077a89955abe4ea009.1678380991.git.hns@goldelico.com Fixes: bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO") Fixes: 415b6185c541 ("PCI: imx6: Fix config read timeout handling") Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>
2023-03-23driver core: bus: mark the struct bus_type for sysfs callbacks as constantGreg Kroah-Hartman
struct bus_type should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct bus_type to be moved to read-only memory. Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hu Haowen <src.res@email.cn> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Stuart Yoder <stuyoder@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl Reviewed-by: Alex Shi <alexs@kernel.org> Acked-by: Iwona Winiarska <iwona.winiarska@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-21cxl/pci: Fix CDAT retrieval on big endianLukas Wunner
The CDAT exposed in sysfs differs between little endian and big endian arches: On big endian, every 4 bytes are byte-swapped. PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors such as pci_read_config_dword() implicitly swap bytes on big endian. That way, the macros in include/uapi/linux/pci_regs.h work regardless of the arch's endianness. For an example of implicit byte-swapping, see ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx (Load Word Byte-Reverse Indexed). DOE Read/Write Data Mailbox Registers are unlike other registers in Configuration Space in that they contain or receive a 4 byte portion of an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f). They need to be copied to or from the request/response buffer verbatim. So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit byte-swapping. The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume implicit byte-swapping. Byte-swap requests after constructing them with those macros and byte-swap responses before parsing them. Change the request and response type to __le32 to avoid sparse warnings. Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for consistency. Fixes: c97006046c79 ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-03-21PCI: dwc: Fix PORT_LINK_CONTROL update when CDM check enabledYoshihiro Shimoda
If CDM_CHECK is enabled (by the DT "snps,enable-cdm-check" property), 'val' is overwritten by PCIE_PL_CHK_REG_CONTROL_STATUS initialization. Commit ec7b952f453c ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists") did not account for further usage of 'val', so we wrote improper values to PCIE_PORT_LINK_CONTROL when the CDM check is enabled. Move the PCIE_PORT_LINK_CONTROL update to be completely after the PCIE_PL_CHK_REG_CONTROL_STATUS register initialization. [bhelgaas: commit log adapted from Serge's version] Fixes: ec7b952f453c ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists") Link: https://lore.kernel.org/r/20230310123510.675685-2-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
2023-03-17PCI: layerscape: Add EP mode support for ls1028aXiaowei Bao
Add PCIe EP mode support for ls1028a. Link: https://lore.kernel.org/r/20230209151050.233973-1-Frank.Li@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Alok Tiwari <alok.a.tiwari@oracle.com> Acked-by: Roy Zang <Roy.Zang@nxp.com>
2023-03-17driver core: class: remove module * from class_create()Greg Kroah-Hartman
The module pointer in class_create() never actually did anything, and it shouldn't have been requred to be set as a parameter even if it did something. So just remove it and fix up all callers of the function in the kernel tree at the same time. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-13PCI: s390: Fix use-after-free of PCI resources with per-function hotplugNiklas Schnelle
On s390 PCI functions may be hotplugged individually even when they belong to a multi-function device. In particular on an SR-IOV device VFs may be removed and later re-added. In commit a50297cf8235 ("s390/pci: separate zbus creation from scanning") it was missed however that struct pci_bus and struct zpci_bus's resource list retained a reference to the PCI functions MMIO resources even though those resources are released and freed on hot-unplug. These stale resources may subsequently be claimed when the PCI function re-appears resulting in use-after-free. One idea of fixing this use-after-free in s390 specific code that was investigated was to simply keep resources around from the moment a PCI function first appeared until the whole virtual PCI bus created for a multi-function device disappears. The problem with this however is that due to the requirement of artificial MMIO addreesses (address cookies) extra logic is then needed to keep the address cookies compatible on re-plug. At the same time the MMIO resources semantically belong to the PCI function so tying their lifecycle to the function seems more logical. Instead a simpler approach is to remove the resources of an individually hot-unplugged PCI function from the PCI bus's resource list while keeping the resources of other PCI functions on the PCI bus untouched. This is done by introducing pci_bus_remove_resource() to remove an individual resource. Similarly the resource also needs to be removed from the struct zpci_bus's resource list. It turns out however, that there is really no need to add the MMIO resources to the struct zpci_bus's resource list at all and instead we can simply use the zpci_bar_struct's resource pointer directly. Fixes: a50297cf8235 ("s390/pci: separate zbus creation from scanning") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230306151014.60913-2-schnelle@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-10PCI: rcar: Avoid defines prefixed with CONFIGLukas Bulwahn
Defines prefixed with "CONFIG" should be limited to proper Kconfig options, that are introduced in a Kconfig file. In the R-car driver the bitmask to configure the SEND_ENABLE mode is named CONFIG_SEND_ENABLE. Rename this local definition to a more suitable name, containing the register bitfield name defined in the R-Car Gen3 rev. 2.30 user manual. No functional change. Link: https://lore.kernel.org/r/20230113084516.31888-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> [lpieralisi@kernel.org: Changed define naming and commit log] Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
2023-03-10PCI: kirin: Select REGMAP_MMIOJosh Triplett
pcie-kirin uses regmaps, and needs to pull them in; otherwise, with CONFIG_PCIE_KIRIN=y and without CONFIG_REGMAP_MMIO pcie-kirin produces a linker failure looking for __devm_regmap_init_mmio_clk(). Fixes: d19afe7be126 ("PCI: kirin: Use regmap for APB registers") Link: https://lore.kernel.org/r/04636141da1d6d592174eefb56760511468d035d.1668410580.git.josh@joshtriplett.org Signed-off-by: Josh Triplett <josh@joshtriplett.org> [lpieralisi@kernel.org: commit log and removed REGMAP select] Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: stable@vger.kernel.org # 5.16+
2023-03-05Merge tag 'irq-urgent-2023-03-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "A set of updates for the interrupt susbsystem: - Prevent possible NULL pointer derefences in irq_data_get_affinity_mask() and irq_domain_create_hierarchy() - Take the per device MSI lock before invoking code which relies on it being hold - Make sure that MSI descriptors are unreferenced before freeing them. This was overlooked when the platform MSI code was converted to use core infrastructure and results in a fals positive warning - Remove dead code in the MSI subsystem - Clarify the documentation for pci_msix_free_irq() - More kobj_type constification" * tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced genirq/msi: Drop dead domain name assignment irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy() genirq/irqdesc: Make kobj_type structures constant PCI/MSI: Clarify usage of pci_msix_free_irq() genirq/msi: Take the per-device MSI lock before validating the control structure genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
2023-02-25Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: - device feature provisioning in ifcvf, mlx5 - new SolidNET driver - support for zoned block device in virtio blk - numa support in virtio pmem - VIRTIO_F_RING_RESET support in vhost-net - more debugfs entries in mlx5 - resume support in vdpa - completion batching in virtio blk - cleanup of dma api use in vdpa - now simulating more features in vdpa-sim - documentation, features, fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits) vdpa/mlx5: support device features provisioning vdpa/mlx5: make MTU/STATUS presence conditional on feature bits vdpa: validate device feature provisioning against supported class vdpa: validate provisioned device features against specified attribute vdpa: conditionally read STATUS in config space vdpa: fix improper error message when adding vdpa dev vdpa/mlx5: Initialize CVQ iotlb spinlock vdpa/mlx5: Don't clear mr struct on destroy MR vdpa/mlx5: Directly assign memory key tools/virtio: enable to build with retpoline vringh: fix a typo in comments for vringh_kiov vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails scsi: virtio_scsi: fix handling of kmalloc failure vdpa: Fix a couple of spelling mistakes in some messages vhost-net: support VIRTIO_F_RING_RESET vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit vdpa: mlx5: support per virtqueue dma device vdpa: set dma mask for vDPA device virtio-vdpa: support per vq dma device vdpa: introduce get_vq_dma_device() ...
2023-02-25Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds
Pull Compute Express Link (CXL) updates from Dan Williams: "To date Linux has been dependent on platform-firmware to map CXL RAM regions and handle events / errors from devices. With this update we can now parse / update the CXL memory layout, and report events / errors from devices. This is a precursor for the CXL subsystem to handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that for DDR-attached-DRAM is handled by the EDAC driver where it maps system physical address events to a field-replaceable-unit (FRU / endpoint device). In general, CXL has the potential to standardize what has historically been a pile of memory-controller-specific error handling logic. Another change of note is the default policy for handling RAM-backed device-dax instances. Previously the default access mode was "device", mmap(2) a device special file to access memory. The new default is "kmem" where the address range is assigned to the core-mm via add_memory_driver_managed(). This saves typical users from wondering why their platform memory is not visible via free(1) and stuck behind a device-file. At the same time it allows expert users to deploy policy to, for example, get dedicated access to high performance memory, or hide low performance memory from general purpose kernel allocations. This affects not only CXL, but also systems with high-bandwidth-memory that platform-firmware tags with the EFI_MEMORY_SP (special purpose) designation. Summary: - CXL RAM region enumeration: instantiate 'struct cxl_region' objects for platform firmware created memory regions - CXL RAM region provisioning: complement the existing PMEM region creation support with RAM region support - "Soft Reservation" policy change: Online (memory hot-add) soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for setting aside such memory for dedicated access via device-dax. - CXL Events and Interrupts: Takeover CXL event handling from platform-firmware (ACPI calls this CXL Memory Error Reporting) and export CXL Events via Linux Trace Events. - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL subsystem interrogate the result of CXL _OSC negotiation. - Emulate CXL DVSEC Range Registers as "decoders": Allow for first-generation devices that pre-date the definition of the CXL HDM Decoder Capability to translate the CXL DVSEC Range Registers into 'struct cxl_decoder' objects. - Set timestamp: Per spec, set the device timestamp in case of hotplug, or if platform-firwmare failed to set it. - General fixups: linux-next build issues, non-urgent fixes for pre-production hardware, unit test fixes, spelling and debug message improvements" * tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits) dax/kmem: Fix leak of memory-hotplug resources cxl/mem: Add kdoc param for event log driver state cxl/trace: Add serial number to trace points cxl/trace: Add host output to trace points cxl/trace: Standardize device information output cxl/pci: Remove locked check for dvsec_range_allowed() cxl/hdm: Add emulation when HDM decoders are not committed cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders cxl/hdm: Emulate HDM decoder from DVSEC range registers cxl/pci: Refactor cxl_hdm_decode_init() cxl/port: Export cxl_dvsec_rr_decode() to cxl_port cxl/pci: Break out range register decoding from cxl_hdm_decode_init() cxl: add RAS status unmasking for CXL cxl: remove unnecessary calling of pci_enable_pcie_error_reporting() dax/hmem: build hmem device support as module if possible dax: cxl: add CXL_REGION dependency cxl: avoid returning uninitialized error code cxl/pmem: Fix nvdimm registration races cxl/mem: Fix UAPI command comment cxl/uapi: Tag commands from cxl_query_cmd() ...
2023-02-24Merge tag 'phy-for-6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "This features a bunch of new device support, a couple of new drivers, yaml conversion and updates of a few drivers. Core support: - New devm_of_phy_optional_get() API with users and conversion New hardware support: - Mediatek MT7986 phy support - Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250 USB3 phy support, SM6350 combo phy support, SM6125 UFS PHY support amd SM8350 & SM8450 combo phy support - Qualcomm SNPS eUSB2 eUSB2 repeater drivers - Allwinner F1C100s USB PHY support - Tegra xusb support for Tegra234 Updates: - Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy - G4 mode support in Qualcomm UFS phy and support for various SoCs - Yaml conversion for Meson usb2 phy - TI Type C support for usb phy for j721 - Yaml conversion for Tegra xusb binding" * tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (106 commits) phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater phy: qcom: Add QCOM SNPS eUSB2 repeater driver dt-bindings: phy: qcom,snps-eusb2-phy: Add phys property for the repeater dt-bindings: phy: Add qcom,snps-eusb2-repeater schema file dt-bindings: phy: amlogic,g12a-usb3-pcie-phy: add missing optional phy-supply property phy: rockchip-typec: Fix unsigned comparison with less than zero phy: rockchip-typec: fix tcphy_get_mode error case phy: qcom: snps-eusb2: Add missing headers phy: qcom-qmp-combo: Add support for SM8550 phy: qcom-qmp: Add v6 DP register offsets phy: qcom-qmp: pcs-usb: Add v6 register offsets dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp: Document SM8550 compatible phy: qcom: Add QCOM SNPS eUSB2 driver dt-bindings: phy: Add qcom,snps-eusb2-phy schema file phy: qcom-qmp-pcie: Add support for SM8550 g3x2 and g4x2 PCIEs phy: qcom-qmp: qserdes-lane-shared: Add v6 register offsets phy: qcom-qmp: qserdes-txrx: Add v6.20 register offsets phy: qcom-qmp: pcs-pcie: Add v6.20 register offsets phy: qcom-qmp: pcs-pcie: Add v6 register offsets phy: qcom-qmp: pcs: Add v6.20 register offsets ...
2023-02-24Merge tag 'pci-v6.3-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Rework portdrv shutdown so it disables interrupts but doesn't disable bus mastering, which leads to hangs on Loongson LS7A - Add mechanism to prevent Max_Read_Request_Size (MRRS) increases, again to avoid hardware issues on Loongson LS7A (and likely other devices based on DesignWare IP) - Ignore devices with a firmware (DT or ACPI) node that says the device is disabled Resource management: - Distribute spare resources to unconfigured hotplug bridges at boot-time (not just when hot-adding such a bridge), which makes hot-adding devices to docks work better. Tried this in v6.1 but had to revert for regressions, so try again - Fix root bus issue that dropped resources that happened to end at 0, e.g., [bus 00] PCI device hotplug: - Remove device locking when marking device as disconnected so this doesn't have to wait for concurrent driver bind/unbind to complete - Quirk more Qualcomm bridges that don't fully implement the PCIe Slot Status 'Command Completed' bit Power management: - Account for _S0W of the target bridge in acpi_pci_bridge_d3() so we don't miss hot-add notifications for USB4 docks, Thunderbolt, etc Reset: - Observe delay after reset, e.g., resuming from system sleep, regardless of whether a bridge can suspend to D3cold at runtime - Wait for secondary bus to become ready after a bridge reset Virtualization: - Avoid FLR on some AMD FCH AHCI adapters where it doesn't work - Allow independent IOMMU groups for some Wangxun NICs that prevent peer-to-peer transactions but don't advertise an ACS Capability Error handling: - Configure End-to-End-CRC (ECRC) only if Linux owns the AER Capability - Remove redundant Device Control Error Reporting Enable in the AER service driver since this is already done for all devices during enumeration ASPM: - Add pci_enable_link_state() interface to allow drivers to enable ASPM link state Endpoint framework: - Move dra7xx and tegra194 linkup processing from hard IRQ to threaded IRQ handler - Add a separate lock for endpoint controller list of endpoint function drivers to prevent deadlock in callbacks - Pass events from endpoint controller to endpoint function drivers via callbacks instead of notifiers Synopsys DesignWare eDMA controller driver (acked by Vinod): - Fix CPU vs PCI address issues - Fix source vs destination address issues - Fix issues with interleaved transfer semantics - Fix channel count initialization issue (issue still exists in several other drivers) - Clean up and improve debugfs usage so it will work on platforms with several eDMA devices Baikal T-1 PCIe controller driver: - Set a 64-bit DMA mask Freescale i.MX6 PCIe controller driver: - Add i.MX8MM, i.MX8MQ, i.MX8MP endpoint mode DT binding and driver support Intel VMD host bridge driver: - Add quirk to configure PCIe ASPM and LTR. This is normally done by BIOS, and will be for future products Marvell MVEBU PCIe controller driver: - Mark this driver as broken in Kconfig since bugs prevent its daily usage MediaTek MT7621 PCIe controller driver: - Delay PHY port initialization to improve boot reliability for ZBT WE1326, ZBT WF3526-P, and some Netgear models Qualcomm PCIe controller driver: - Add MSM8998 DT compatible string - Unify MSM8996 and MSM8998 clock orderings - Add SM8350 DT binding and driver support - Add IPQ8074 Gen3 DT binding and driver support - Correct qcom,perst-regs in DT binding - Add qcom_pcie_host_deinit() so the PHY is powered off and regulators and clocks are disabled on late host-init errors Socionext UniPhier Pro5 controller driver: - Clean up uniphier-ep reg, clocks, resets, and their names in DT binding Synopsys DesignWare PCIe controller driver: - Restrict coherent DMA mask to 32 bits for MSI, but allow controller drivers to set 64-bit streaming DMA mask - Add eDMA engine support in both Root Port and Endpoint controllers Miscellaneous: - Remove MODULE_LICENSE from boolean drivers so they don't look like modules so modprobe can complain about them" * tag 'pci-v6.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (86 commits) PCI: dwc: Add Root Port and Endpoint controller eDMA engine support PCI: bt1: Set 64-bit DMA mask PCI: dwc: Restrict only coherent DMA mask for MSI address allocation dmaengine: dw-edma: Prepare dw_edma_probe() for builtin callers dmaengine: dw-edma: Depend on DW_EDMA instead of selecting it dmaengine: dw-edma: Add mem-mapped LL-entries support PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules PCI: hv: Drop duplicate PCI_MSI dependency PCI/P2PDMA: Annotate RCU dereference PCI/sysfs: Constify struct kobj_type pci_slot_ktype PCI: hotplug: Allow marking devices as disconnected during bind/unbind PCI: pciehp: Add Qualcomm quirk for Command Completed erratum PCI: qcom: Add IPQ8074 Gen3 port support dt-bindings: PCI: qcom: Add IPQ8074 Gen3 port dt-bindings: PCI: qcom: Sort compatibles alphabetically PCI: qcom: Fix host-init error handling PCI: qcom: Add SM8350 support dt-bindings: PCI: qcom: Add SM8350 dt-bindings: PCI: qcom-ep: Correct qcom,perst-regs dt-bindings: PCI: qcom: Unify MSM8996 and MSM8998 clock order ...
2023-02-24Merge tag 'driver-core-6.3-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems" [ Geert Uytterhoeven points out that that last sentence isn't true, and that there's a pending report that has a fix that is queued up - Linus ] * tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits) debugfs: drop inline constant formatting for ERR_PTR(-ERROR) OPP: fix error checking in opp_migrate_dentry() debugfs: update comment of debugfs_rename() i3c: fix device.h kernel-doc warnings dma-mapping: no need to pass a bus_type into get_arch_dma_ops() driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place Revert "driver core: add error handling for devtmpfs_create_node()" Revert "devtmpfs: add debug info to handle()" Revert "devtmpfs: remove return value of devtmpfs_delete_node()" driver core: cpu: don't hand-override the uevent bus_type callback. devtmpfs: remove return value of devtmpfs_delete_node() devtmpfs: add debug info to handle() driver core: add error handling for devtmpfs_create_node() driver core: bus: update my copyright notice driver core: bus: add bus_get_dev_root() function driver core: bus: constify bus_unregister() driver core: bus: constify some internal functions driver core: bus: constify bus_get_kset() driver core: bus: constify bus_register/unregister_notifier() driver core: remove private pointer from struct bus_type ...
2023-02-22Merge branch 'pci/misc'Bjorn Helgaas
- Drop bogus kernel-doc marker in pci_endpoint_test.c (Randy Dunlap) - Fix epf_ntb_mw_bar_clear() kernel-doc (Yang Yingliang) - Constify struct kobj_type pci_slot_ktype (Thomas Weißschuh) * pci/misc: PCI: hv: Drop duplicate PCI_MSI dependency PCI/sysfs: Constify struct kobj_type pci_slot_ktype PCI: endpoint: pci-epf-vntb: Add epf_ntb_mw_bar_clear() num_mws kernel-doc misc: pci_endpoint_test: Drop initial kernel-doc marker
2023-02-22Merge branch 'pci/controller/vmd'Bjorn Helgaas
- Add pci_enable_link_state() to allow drivers to enable ASPM link state (Michael Bottini) - Add quirk to enable all ASPM link states and program LTR for devices below VMD (David E. Box) * pci/controller/vmd: PCI: vmd: Add quirk to configure PCIe ASPM and LTR PCI: vmd: Create feature grouping for client products PCI: vmd: Use PCI_VDEVICE in device list PCI/ASPM: Add pci_enable_link_state()