summaryrefslogtreecommitdiff
path: root/drivers/vfio/pci/vfio_pci.c
AgeCommit message (Collapse)Author
2017-12-20vfio-pci: Mask INTx if a device is not capabable of enabling itAlexey Kardashevskiy
At the moment VFIO rightfully assumes that INTx is supported if the interrupt pin is not set to zero in the device config space. However if that is not the case (the pin is not zero but pdev->irq is), vfio_intx_enable() fails. In order to prevent the userspace from trying to enable INTx when we know that it cannot work, let's mask the PCI_INTERRUPT_PIN register. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-07-26vfio/pci: Use pci_try_reset_function() on initial openAlex Williamson
Device lock bites again; if a device .remove() callback races a user calling ioctl(VFIO_GROUP_GET_DEVICE_FD), the unbind request will hold the device lock, but the user ioctl may have already taken a vfio_device reference. In the case of a PCI device, the initial open will attempt to reset the device, which again attempts to get the device lock, resulting in deadlock. Use the trylock PCI reset interface and return error on the open path if reset fails due to lock contention. Link: https://lkml.org/lkml/2017/7/25/381 Reported-by: Wen Congyang <wencongyang2@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-06-13vfio/pci: Add Intel XXV710 to hidden INTx devicesAlex Williamson
XXV710 has the same broken INTx behavior as the rest of the X/XL710 series, the interrupt status register is not wired to report pending INTx interrupts, thus we never associate the interrupt to the device. Extend the device IDs to include these so that we hide that the device supports INTx at all to the user. Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
2017-01-04vfio-pci: Handle error from pci_iomapArvind Yadav
Here, pci_iomap can fail, handle this case release selected pci regions and return -ENOMEM. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()Kirti Wankhede
Updated vfio_pci.c file to use vfio_set_irqs_validate_and_prepare() Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17vfio_pci: Update vfio_pci to use vfio_info_add_capability()Kirti Wankhede
Update msix_sparse_mmap_cap() to use vfio_info_add_capability() Update region type capability to use vfio_info_add_capability() Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-26vfio/pci: Fix integer overflows, bitmask checkVlad Tsyrklevich
The VFIO_DEVICE_SET_IRQS ioctl did not sufficiently sanitize user-supplied integers, potentially allowing memory corruption. This patch adds appropriate integer overflow checks, checks the range bounds for VFIO_IRQ_SET_DATA_NONE, and also verifies that only single element in the VFIO_IRQ_SET_DATA_TYPE_MASK bitmask is set. VFIO_IRQ_SET_ACTION_TYPE_MASK is already correctly checked later in vfio_pci_set_irqs_ioctl(). Furthermore, a kzalloc is changed to a kcalloc because the use of a kzalloc with an integer multiplication allowed an integer overflow condition to be reached without this patch. kcalloc checks for overflow and should prevent a similar occurrence. Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-07-08vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusiveYongji Xie
Current vfio-pci implementation disallows to mmap sub-page(size < PAGE_SIZE) MMIO BARs because these BARs' mmio page may be shared with other BARs. This will cause some performance issues when we passthrough a PCI device with this kind of BARs. Guest will be not able to handle the mmio accesses to the BARs which leads to mmio emulations in host. However, not all sub-page BARs will share page with other BARs. We should allow to mmap the sub-page MMIO BARs which we can make sure will not share page with other BARs. This patch adds support for this case. And we try to add a dummy resource to reserve the remainder of the page which hot-add device's BAR might be assigned into. But it's not necessary to handle the case when the BAR is not page aligned. Because we can't expect the BAR will be assigned into the same location in a page in guest when we passthrough the BAR. And it's hard to access this BAR in userspace because we have no way to get the BAR's location in a page. Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-04-28vfio/pci: Hide broken INTx support from userAlex Williamson
INTx masking has two components, the first is that we need the ability to prevent the device from continuing to assert INTx. This is provided via the DisINTx bit in the command register and is the only thing we can really probe for when testing if INTx masking is supported. The second component is that the device needs to indicate if INTx is asserted via the interrupt status bit in the device status register. With these two features we can generically determine if one of the devices we own is asserting INTx, signal the user, and mask the interrupt while the user services the device. Generally if one or both of these components is broken we resort to APIC level interrupt masking, which requires an exclusive interrupt since we have no way to determine the source of the interrupt in a shared configuration. This often makes it difficult or impossible to configure the system for userspace use of the device, for an interrupt mode that the user may not need. One possible configuration of broken INTx masking is that the DisINTx support is fully functional, but the interrupt status bit never signals interrupt assertion. In this case we do have the ability to prevent the device from asserting INTx, but lack the ability to identify the interrupt source. For this case we can simply pretend that the device lacks INTx support entirely, keeping DisINTx set on the physical device, virtualizing this bit for the user, and virtualizing the interrupt pin register to indicate no INTx support. We already support virtualization of the DisINTx bit and already virtualize the interrupt pin for platforms without INTx support. By tying these components together, setting DisINTx on open and reset, and identifying devices broken in this particular way, we can provide support for them w/o the handicap of APIC level INTx masking. Intel i40e (XL710/X710) 10/20/40GbE NICs have been identified as being broken in this specific way. We leave the vfio-pci.nointxmask option as a mechanism to bypass this support, enabling INTx on the device with all the requirements of APIC level masking. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
2016-03-17Merge tag 'vfio-v4.6-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: "Various enablers for assignment of Intel graphics devices and future support of vGPU devices (Alex Williamson). This includes - Handling the vfio type1 interface as an API rather than a specific implementation, allowing multiple type1 providers. - Capability chains, similar to PCI device capabilities, that allow extending ioctls. Extensions here include device specific regions and sparse mmap descriptions. The former is used to expose non-PCI regions for IGD, including the OpRegion (particularly the Video BIOS Table), and read only PCI config access to the host and LPC bridge as drivers often depend on identifying those devices. Sparse mmaps here are used to describe the MSIx vector table, which vfio has always protected from mmap, but never had an API to explicitly define that protection. In future vGPU support this is expected to allow the description of PCI BARs that may mix direct access and emulated access within a single region. - The ability to expose the shadow ROM as an option ROM as IGD use cases may rely on the ROM even though the physical device does not make use of a PCI option ROM BAR" * tag 'vfio-v4.6-rc1' of git://github.com/awilliam/linux-vfio: vfio/pci: return -EFAULT if copy_to_user fails vfio/pci: Expose shadow ROM as PCI option ROM vfio/pci: Intel IGD host and LCP bridge config space access vfio/pci: Intel IGD OpRegion support vfio/pci: Enable virtual register in PCI config space vfio/pci: Add infrastructure for additional device specific regions vfio: Define device specific region type capability vfio/pci: Include sparse mmap capability for MSI-X table regions vfio: Define sparse mmap capability for regions vfio: Add capability chain helpers vfio: Define capability chains vfio: If an IOMMU backend fails, keep looking vfio/pci: Fix unsigned comparison overflow
2016-02-28vfio: fix ioctl error handlingMichael S. Tsirkin
Calling return copy_to_user(...) in an ioctl will not do the right thing if there's a pagefault: copy_to_user returns the number of bytes not copied in this case. Fix up vfio to do return copy_to_user(...)) ? -EFAULT : 0; everywhere. Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-25vfio/pci: return -EFAULT if copy_to_user failsDan Carpenter
The copy_to_user() function returns the number of bytes that were not copied but we want to return -EFAULT on error here. Fixes: 188ad9d6cbbc ('vfio/pci: Include sparse mmap capability for MSI-X table regions') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio/pci: Expose shadow ROM as PCI option ROMAlex Williamson
Integrated graphics may have their ROM shadowed at 0xc0000 rather than implement a PCI option ROM. Make this ROM appear to the user using the ROM BAR. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio/pci: Intel IGD host and LCP bridge config space accessAlex Williamson
Provide read-only access to PCI config space of the PCI host bridge and LPC bridge through device specific regions. This may be used to configure a VM with matching register contents to satisfy driver requirements. Providing this through the vfio file descriptor removes an additional userspace requirement for access through pci-sysfs and removes the CAP_SYS_ADMIN requirement that doesn't appear to apply to the specific devices we're accessing. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio/pci: Intel IGD OpRegion supportAlex Williamson
This is the first consumer of vfio device specific resource support, providing read-only access to the OpRegion for Intel graphics devices. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio/pci: Add infrastructure for additional device specific regionsAlex Williamson
Add support for additional regions with indexes started after the already defined fixed regions. Device specific code can register these regions with the new vfio_pci_register_dev_region() function. The ops structure per region currently only includes read/write access and a release function, allowing automatic cleanup when the device is closed. mmap support is only missing here because it's not needed by the first user queued for this support. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-22vfio/pci: Include sparse mmap capability for MSI-X table regionsAlex Williamson
vfio-pci has never allowed the user to directly mmap the MSI-X vector table, but we've always relied on implicit knowledge of the user that they cannot do this. Now that we have capability chains that we can expose in the region info ioctl and a sparse mmap capability that represents the sub-areas within the region that can be mmap'd, we can make the mmap constraints more explicit. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-12-21vfio: Include No-IOMMU modeAlex Williamson
There is really no way to safely give a user full access to a DMA capable device without an IOMMU to protect the host system. There is also no way to provide DMA translation, for use cases such as device assignment to virtual machines. However, there are still those users that want userspace drivers even under those conditions. The UIO driver exists for this use case, but does not provide the degree of device access and programming that VFIO has. In an effort to avoid code duplication, this introduces a No-IOMMU mode for VFIO. This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling the "enable_unsafe_noiommu_mode" option on the vfio driver. This should make it very clear that this mode is not safe. Additionally, CAP_SYS_RAWIO privileges are necessary to work with groups and containers using this mode. Groups making use of this support are named /dev/vfio/noiommu-$GROUP and can only make use of the special VFIO_NOIOMMU_IOMMU for the container. Use of this mode, specifically binding a device without a native IOMMU group to a VFIO bus driver will taint the kernel and should therefore not be considered supported. This patch includes no-iommu support for the vfio-pci bus driver only. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-04Revert: "vfio: Include No-IOMMU mode"Alex Williamson
Revert commit 033291eccbdb ("vfio: Include No-IOMMU mode") due to lack of a user. This was originally intended to fill a need for the DPDK driver, but uptake has been slow so rather than support an unproven kernel interface revert it and revisit when userspace catches up. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-11-19vfio-pci: constify pci_error_handlers structuresJulia Lawall
This pci_error_handlers structure is never modified, like all the other pci_error_handlers structures, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-11-04vfio: Include No-IOMMU modeAlex Williamson
There is really no way to safely give a user full access to a DMA capable device without an IOMMU to protect the host system. There is also no way to provide DMA translation, for use cases such as device assignment to virtual machines. However, there are still those users that want userspace drivers even under those conditions. The UIO driver exists for this use case, but does not provide the degree of device access and programming that VFIO has. In an effort to avoid code duplication, this introduces a No-IOMMU mode for VFIO. This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling the "enable_unsafe_noiommu_mode" option on the vfio driver. This should make it very clear that this mode is not safe. Additionally, CAP_SYS_RAWIO privileges are necessary to work with groups and containers using this mode. Groups making use of this support are named /dev/vfio/noiommu-$GROUP and can only make use of the special VFIO_NOIOMMU_IOMMU for the container. Use of this mode, specifically binding a device without a native IOMMU group to a VFIO bus driver will taint the kernel and should therefore not be considered supported. This patch includes no-iommu support for the vfio-pci bus driver only. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2015-06-09vfio/pci: Fix racy vfio_device_get_from_dev() callAlex Williamson
Testing the driver for a PCI device is racy, it can be all but complete in the release path and still report the driver as ours. Therefore we can't trust drvdata to be valid. This race can sometimes be seen when one port of a multifunction device is being unbound from the vfio-pci driver while another function is being released by the user and attempting a bus reset. The device in the remove path is found as a dependent device for the bus reset of the release path device, the driver is still set to vfio-pci, but the drvdata has already been cleared, resulting in a null pointer dereference. To resolve this, fix vfio_device_get_from_dev() to not take the dev_get_drvdata() shortcut and instead traverse through the iommu_group, vfio_group, vfio_device path to get a reference we can trust. Once we have that reference, we know the device isn't in transition and we can test to make sure the driver is still what we expect, so that we don't interfere with devices we don't own. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-05-01vfio-pci: Log device requests more verboselyAlex Williamson
Log some clues indicating whether the user is receiving device request interfaces or not listening. This can help indicate why a driver unbind is blocked or explain why QEMU automatically unplugged a device from the VM. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-08vfio-pci: Fix use after freeAlex Williamson
Reported by 0-day test infrastructure. Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-07vfio-pci: Move idle devices to D3hot power stateAlex Williamson
We can save some power by putting devices that are bound to vfio-pci but not in use by the user in the D3hot power state. Devices get woken into D0 when opened by the user. Resets return the device to D0, so we need to re-apply the low power state after a bus reset. It's tempting to try to use D3cold, but we have no reason to inhibit hotplug of idle devices and we might get into a loop of having the device disappear before we have a chance to try to use it. A new module parameter allows this feature to be disabled if there are devices that misbehave as a result of this change. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-07vfio-pci: Remove warning if try-reset failsAlex Williamson
As indicated in the comment, this is not entirely uncommon and causes user concern for no reason. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-07vfio-pci: Allow PCI IDs to be specified as module optionsAlex Williamson
This copies the same support from pci-stub for exactly the same purpose, enabling a set of PCI IDs to be automatically added to the driver's dynamic ID table at module load time. The code here is pretty simple and both vfio-pci and pci-stub are fairly unique in being meta drivers, capable of attaching to any device, so there's no attempt made to generalize the code into pci-core. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-07vfio-pci: Add VGA arbiter clientAlex Williamson
If VFIO VGA access is disabled for the user, either by CONFIG option or module parameter, we can often opt-out of VGA arbitration. We can do this when PCI bridge control of VGA routing is possible. This means that we must have a parent bridge and there must only be a single VGA device below that bridge. Fortunately this is the typical case for discrete GPUs. Doing this allows us to minimize the impact of additional GPUs, in terms of VGA arbitration, when they are only used via vfio-pci for non-VGA applications. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-04-07vfio-pci: Add module option to disable VGA region accessAlex Williamson
Add a module option so that we don't require a CONFIG change and kernel rebuild to disable VGA support. Not only can VGA support be troublesome in itself, but by disabling it we can reduce the impact to host devices by doing a VGA arbitration opt-out. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16vfio: initialize the virqfd workqueue in VFIO generic codeAntonios Motakis
Now we have finally completely decoupled virqfd from VFIO_PCI. We can initialize it from the VFIO generic code, in order to safely use it from multiple independent VFIO bus drivers. Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com> Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Tested-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16vfio: virqfd: rename vfio_pci_virqfd_init and vfio_pci_virqfd_exitAntonios Motakis
The functions vfio_pci_virqfd_init and vfio_pci_virqfd_exit are not really PCI specific, since we plan to reuse the virqfd code with more VFIO drivers in addition to VFIO_PCI. Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com> [Baptiste Reynal: Move rename vfio_pci_virqfd_init and vfio_pci_virqfd_exit from "vfio: add a vfio_ prefix to virqfd_enable and virqfd_disable and export"] Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Tested-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10vfio-pci: Add device request interfaceAlex Williamson
Userspace can opt to receive a device request notification, indicating that the device should be released. This is setup the same way as the error IRQ and also supports eventfd signaling. Future support may forcefully remove the device from the user if the request is ignored. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-01-07vfio-pci: Fix the check on pci device type in vfio_pci_probe()Wei Yang
Current vfio-pci just supports normal pci device, so vfio_pci_probe() will return if the pci device is not a normal device. While current code makes a mistake. PCI_HEADER_TYPE is the offset in configuration space of the device type, but we use this value to mask the type value. This patch fixs this by do the check directly on the pci_dev->hdr_type. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org # v3.6+
2014-11-07vfio: make vfio run on s390Frank Blaschka
add Kconfig switch to hide INTx add Kconfig switch to let vfio announce PCI BARs are not mapable Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-29vfio-pci: Fix remove path lockingAlex Williamson
Locking both the remove() and release() path results in a deadlock that should have been obvious. To fix this we can get and hold the vfio_device reference as we evaluate whether to do a bus/slot reset. This will automatically block any remove() calls, allowing us to remove the explict lock. Fixes 61d792562b53. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org [3.17]
2014-08-08drivers/vfio: Enable VFIO if EEH is not supportedAlexey Kardashevskiy
The existing vfio_pci_open() fails upon error returned from vfio_spapr_pci_eeh_open(), which breaks POWER7's P5IOC2 PHB support which this patch brings back. The patch fixes the issue by dropping the return value of vfio_spapr_pci_eeh_open(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-07vfio-pci: Attempt bus/slot reset on releaseAlex Williamson
Each time a device is released, mark whether a local reset was successful or whether a bus/slot reset is needed. If a reset is needed and all of the affected devices are bound to vfio-pci and unused, allow the reset. This is most useful when the userspace driver is killed and releases all the devices in an unclean state, such as when a QEMU VM quits. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-07vfio-pci: Use mutex around open, release, and removeAlex Williamson
Serializing open/release allows us to fix a refcnt error if we fail to enable the device and lets us prevent devices from being unbound or opened, giving us an opportunity to do bus resets on release. No restriction added to serialize binding devices to vfio-pci while the mutex is held though. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-07vfio-pci: Release devices with BusMaster disabledAlex Williamson
Our current open/release path looks like this: vfio_pci_open vfio_pci_enable pci_enable_device pci_save_state pci_store_saved_state vfio_pci_release vfio_pci_disable pci_disable_device pci_restore_state pci_enable_device() doesn't modify PCI_COMMAND_MASTER, so if a device comes to us with it enabled, it persists through the open and gets stored as part of the device saved state. We then restore that saved state when released, which can allow the device to attempt to continue to do DMA. When the group is disconnected from the domain, this will get caught by the IOMMU, but if there are other devices in the group, the device may continue running and interfere with the user. Even in the former case, IOMMUs don't necessarily behave well and a stream of blocked DMA can result in unpleasant behavior on the host. Explicitly disable Bus Master as we're enabling the device and slightly re-work release to make sure that pci_disable_device() is the last thing that touches the device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-05drivers/vfio: EEH support for VFIO PCI deviceGavin Shan
The patch adds new IOCTL commands for sPAPR VFIO container device to support EEH functionality for PCI devices, which have been passed through from host to somebody else via VFIO. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-30drivers/vfio/pci: Fix wrong MSI interrupt countGavin Shan
According PCI local bus specification, the register of Message Control for MSI (offset: 2, length: 2) has bit#0 to enable or disable MSI logic and it shouldn't be part contributing to the calculation of MSI interrupt count. The patch fixes the issue. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-05-30vfio/pci: Fix unchecked return valueAlex Williamson
There's nothing we can do different if pci_load_and_free_saved_state() fails, other than maybe print some log message, but the actual re-load of the state is an unnecessary step here since we've only just saved it. We can cleanup a coverity warning and eliminate the unnecessary step by freeing the state ourselves. Detected by Coverity: CID 753101 Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-24Merge tag 'vfio-v3.14-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio update from Alex Williamson: - convert to misc driver to support module auto loading - remove unnecessary and dangerous use of device_lock * tag 'vfio-v3.14-rc1' of git://github.com/awilliam/linux-vfio: vfio-pci: Don't use device_lock around AER interrupt setup vfio: Convert control interface to misc driver misc: Reserve minor for VFIO
2014-01-15vfio-pci: Use pci "try" reset interfaceAlex Williamson
PCI resets will attempt to take the device_lock for any device to be reset. This is a problem if that lock is already held, for instance in the device remove path. It's not sufficient to simply kill the user process or skip the reset if called after .remove as a race could result in the same deadlock. Instead, we handle all resets as "best effort" using the PCI "try" reset interfaces. This prevents the user from being able to induce a deadlock by triggering a reset. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-14vfio-pci: Don't use device_lock around AER interrupt setupAlex Williamson
device_lock is much too prone to lockups. For instance if we have a pending .remove then device_lock is already held. If userspace attempts to modify AER signaling after that point, a deadlock occurs. eventfd setup/teardown is already protected in vfio with the igate mutex. AER is not a high performance interrupt, so we can also use the same mutex to protect signaling versus setup races. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-09-04vfio-pci: PCI hot reset interfaceAlex Williamson
The current VFIO_DEVICE_RESET interface only maps to PCI use cases where we can isolate the reset to the individual PCI function. This means the device must support FLR (PCIe or AF), PM reset on D3hot->D0 transition, device specific reset, or be a singleton device on a bus for a secondary bus reset. FLR does not have widespread support, PM reset is not very reliable, and bus topology is dictated by the system and device design. We need to provide a means for a user to induce a bus reset in cases where the existing mechanisms are not available or not reliable. This device specific extension to VFIO provides the user with this ability. Two new ioctls are introduced: - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO - VFIO_DEVICE_PCI_HOT_RESET The first provides the user with information about the extent of devices affected by a hot reset. This is essentially a list of devices and the IOMMU groups they belong to. The user may then initiate a hot reset by calling the second ioctl. We must be careful that the user has ownership of all the affected devices found via the first ioctl, so the second ioctl takes a list of file descriptors for the VFIO groups affected by the reset. Each group must have IOMMU protection established for the ioctl to succeed. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-07-24vfio-pci: Avoid deadlock on removeAlex Williamson
If an attempt is made to unbind a device from vfio-pci while that device is in use, the request is blocked until the device becomes unused. Unfortunately, that unbind path still grabs the device_lock, which certain things like __pci_reset_function() also want to take. This means we need to try to acquire the locks ourselves and use the pre-locked version, __pci_reset_function_locked(). Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-06-29vfio: remap_pfn_range() sets all those flags...Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-02Merge tag 'vfio-for-v3.10' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio updates from Alex Williamson: "Changes include extension to support PCI AER notification to userspace, byte granularity of PCI config space and access to unarchitected PCI config space, better protection around IOMMU driver accesses, default file mode fix, and a few misc cleanups." * tag 'vfio-for-v3.10' of git://github.com/awilliam/linux-vfio: vfio: Set container device mode vfio: Use down_reads to protect iommu disconnects vfio: Convert container->group_lock to rwsem PCI/VFIO: use pcie_flags_reg instead of access PCI-E Capabilities Register vfio-pci: Enable raw access to unassigned config space vfio-pci: Use byte granularity in config map vfio: make local function vfio_pci_intx_unmask_handler() static VFIO-AER: Vfio-pci driver changes for supporting AER VFIO: Wrapper for getting reference to vfio_device
2013-04-29Merge tag 'pci-v3.10-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v3.10 merge window: PCI device hotplug - Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe) - Make acpiphp builtin only, not modular (Jiang Liu) - Add acpiphp mutual exclusion (Jiang Liu) Power management - Skip "PME enabled/disabled" messages when not supported (Rafael Wysocki) - Fix fallback to PCI_D0 (Rafael Wysocki) Miscellaneous - Factor quirk_io_region (Yinghai Lu) - Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas) - Clean up EISA resource initialization and logging (Bjorn Helgaas) - Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas) - MIPS: Initialize of_node before scanning bus (Gabor Juhos) - Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor Juhos) - Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang) - Fix aer_inject return values (Prarit Bhargava) - Remove PME/ACPI dependency (Andrew Murray) - Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)" * tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits) vfio-pci: Use cached MSI/MSI-X capabilities vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Remove "extern" from function declarations PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h PCI: Use msix_table_size() directly, drop multi_msix_capable() PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros PCI: Drop is_64bit_address() and is_mask_bit_support() macros PCI: Drop msi_data_reg() macro PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc PCI: Clean up MSI/MSI-X capability #defines PCI: Use cached MSI-X cap while enabling MSI-X PCI: Use cached MSI cap while enabling MSI interrupts PCI: Remove MSI/MSI-X cap check in pci_msi_check_device() PCI: Cache MSI/MSI-X capability offsets in struct pci_dev PCI: Use u8, not int, for PM capability offset [SCSI] megaraid_sas: Use correct #define for MSI-X capability PCI: Remove "extern" from function declarations ...