summaryrefslogtreecommitdiff
path: root/drivers/misc/cxl/cxl.h
AgeCommit message (Collapse)Author
2016-05-20Merge tag 'powerpc-4.7-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." * tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits) powerpc/86xx: Fix PCI interrupt map definition powerpc/86xx: Move pci1 definition to the include file powerpc/fsl: Fix build of the dtb embedded kernel images powerpc/fsl: Fix rcpm compatible string powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC powerpc/fsl-pci: Add a workaround for PCI 5 errata powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb powerpc/powernv/npu: Add PE to PHB's list powerpc/powernv: Fix insufficient memory allocation powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()" powerpc/powernv/npu: Enable NVLink pass through powerpc/powernv/npu: Rework TCE Kill handling powerpc/powernv/npu: Add set/unset window helpers powerpc/powernv/ioda2: Export debug helper pe_level_printk() ...
2016-05-11cxl: Check periodically the coherent platform function's stateChristophe Lombard
In the PowerVM environment, the PHYP CoherentAccel component manages the state of the Coherent Accelerator Processor Interface adapter and virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and interrupts - and provides a new set of hcalls for the OS APIs to utilize Accelerator Function Unit (AFU). During the course of operation, a coherent platform function can encounter errors. Some possible reason for errors are: • Hardware recoverable and unrecoverable errors • Transient and over-threshold correctable errors PHYP implements its own state model for the coherent platform function. The state of the AFU is available through a hcall. The current implementation of the cxl driver, for the PowerVM environment, checks this state of the AFU only when an action is requested - open a device, ioctl command, memory map, attach/detach a process - from an external driver - cxlflash, libcxl. If an error is detected the cxl driver handles the error according the content of the Power Architecture Platform Requirements document. But in case of low-level troubles (or error injection), the PHYP component may reset the card and change the AFU state. The PHYP interface doesn't provide any way to be notified when that happens thus implies that the cxl driver: • cannot handle immediatly the state change of the AFU. • cannot notify other drivers (cxlflash, ...) The purpose of this patch is to wake up the cpu periodically to check the current state of each AFU and to see if we need to enter an error recovery path. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11cxl: Add kernel API to allow a context to operate with relocate disabledIan Munsie
cxl devices typically access memory using an MMU in much the same way as the CPU, and each context includes a state register much like the MSR in the CPU. Like the CPU, the state register includes a bit to enable relocation, which we currently always enable. In some cases, it may be desirable to allow a device to access memory using real addresses instead of effective addresses, so this adds a new API, cxl_set_translation_mode, that can be used to disable relocation on a given kernel context. This can allow for the creation of a special privileged context that the device can use if it needs relocation disabled, and can use regular contexts at times when it needs relocation enabled. This interface is only available to users of the kernel API for obvious reasons, and will never be supported in a virtualised environment. This will be used by the upcoming cxl support in the mlx5 driver. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11cxl: Remove duplicate #definesIan Munsie
These defines are not used, but other equivalent definitions (CXL_SPA_SW_CMD_*) are used. Remove the unused defines. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27cxl: Poll for outstanding IRQs when detaching a contextMichael Neuling
When detaching contexts, we may still have interrupts in the system which are yet to be delivered to any CPU and be acked in the PSL. This can result in a subsequent unrelated process getting an spurious IRQ or an interrupt for a non-existent context. This polls the PSL to ensure that the PSL is clear of IRQs for the detached context, before removing the context from the idr. Signed-off-by: Michael Neuling <mikey@neuling.org> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-22cxl: Allow initialization on timebase sync failuresFrederic Barrat
Failure to synchronize the PSL timebase currently prevents the initialization of the cxl card, thus rendering the card useless. This is too extreme for a feature which is rarely used, if at all. No hardware AFUs or software is currently using PSL timebase. This patch still tries to synchronize the PSL timebase when the card is initialized, but ignores the error if it can't. Instead, it reports a status via /sys. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Ignore probes for virtual afu pci devicesVaibhav Jain
Add a check at the beginning of cxl_probe function to ignore virtual pci devices created for each afu registered. This fixes the the errors messages logged about missing CXL vsec, when cxl probe is unable to find necessary vsec entries in device pci config space. The error message logged are of the form : cxl-pci 0004:00:00.0: ABORTING: CXL VSEC not found! cxl-pci 0004:00:00.0: cxl_init_adapter failed: -19 Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: fbarrat@linux.vnet.ibm.com Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Adapter failure handlingChristophe Lombard
Check the AFU state whenever an API is called. The hypervisor may issue a reset of the adapter when it detects a fault. When it happens, it launches an error recovery which will either move the AFU to a permanent failure state, or in the disabled state. If the AFU is found to be disabled, detach all existing contexts from it before issuing a AFU reset to re-enable it. Before detaching contexts, notify any kernel driver through the EEH callbacks of the AFU pci device. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Support the cxl kernel API from a guestFrederic Barrat
Like on bare-metal, the cxl driver creates a virtual PHB and a pci device for the AFU. The configuration space of the device is mapped to the configuration record of the AFU. Reuse the code defined in afu_cr_read8|16|32() when reading the configuration space of the AFU device. Even though the (virtual) AFU device is a pci device, the adapter is not. So a driver using the cxl kernel API cannot read the VPD of the adapter through the usual PCI interface. Therefore, we add a call to the cxl kernel API: ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count); Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Support to flash a new image on the adapter from a guestChristophe Lombard
The new flash.c file contains the logic to flash a new image on the adapter, through a hcall. It is an iterative process, with chunks of data of 1M at a time. There are also 2 phases: write and verify. The flash operation itself is driven from a user-land tool. Once flashing is successful, an rtas call is made to update the device tree with the new properties values for the adapter and the AFU(s) Add a new char device for the adapter, so that the flash tool can access the card, even if there is no valid AFU on it. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: sysfs support for guestsChristophe Lombard
Filter out a few adapter parameters which don't make sense in a guest. Document the changes. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Add guest-specific codeChristophe Lombard
The new of.c file contains code to parse the device tree to find out about cxl adapters and AFUs. guest.c implements the guest-specific callbacks for the backend API. The process element ID is not known until the context is attached, so we have to separate the context ID assigned by the cxl driver from the process element ID visible to the user applications. In bare-metal, the 2 IDs match. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> [mpe: Fix SMP=n build, fix PSERIES=n build, minor whitespace fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Separate bare-metal fields in adapter and AFU data structuresChristophe Lombard
Introduce sub-structures containing the bare-metal specific fields in the structures describing the adapter (struct cxl) and AFU (struct cxl_afu). Update all their references. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: New hcalls to support cxl adaptersChristophe Lombard
The hypervisor calls provide an interface with a coherent platform facility and function. It matches version 0.16 of the 'PAPR changes' document. The following hcalls are supported: H_ATTACH_CA_PROCESS Attach a process element to a coherent platform function. H_DETACH_CA_PROCESS Detach a process element from a coherent platform function. H_CONTROL_CA_FUNCTION Allow the partition to manipulate or query certain coherent platform function behaviors. H_COLLECT_CA_INT_INFO Collect interrupt info about a coherent. platform function after an interrupt occurred H_CONTROL_CA_FAULTS Control the operation of a coherent platform function after a fault occurs. H_DOWNLOAD_CA_FACILITY Support for downloading a base adapter image to the coherent platform facility, and for validating the entire image after the download. H_CONTROL_CA_FACILITY Allow the partition to manipulate or query certain coherent platform facility behaviors. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Update cxl_irq() prototypeFrederic Barrat
The context parameter when calling cxl_irq() should be strongly typed. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Isolate a few bare-metal-specific callsFrederic Barrat
A few functions are mostly common between bare-metal and guest and just need minor tuning. To avoid crowding the backend API, introduce a few 'if' based on the CPU being in HV mode. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Rename some bare-metal specific functionsFrederic Barrat
Rename a few functions, changing the 'cxl_' prefix to either 'cxl_pci_' or 'cxl_native_', to make clear that the implementation is bare-metal specific. Those functions will have an equivalent implementation for a guest in a later patch. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Introduce implementation-specific APIFrederic Barrat
The backend API (in cxl.h) lists some low-level functions whose implementation is different on bare-metal and in a guest. Each environment implements its own functions, and the common code uses them through function pointers, defined in cxl_backend_ops Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Move bare-metal specific code to specialized filesFrederic Barrat
Move a few functions around to better separate code specific to bare-metal environment from code which will be commonly used between guest and bare-metal. Code specific to bare-metal is meant to be in native.c or pci.c only. It's basically anything which touches the card p1 registers, some p2 registers not needed from a guest and the PCI interface. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Move common code away from bare-metal-specific filesChristophe Lombard
Move around some functions which will be accessed from the bare-metal and guest environments. Code in native.c and pci.c is meant to be bare-metal specific. Other files contain code which may be shared with guests. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-05cxl: Fix DSI misses when the context owning task exitsVaibhav Jain
Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we store the pid of the current task_struct and use it to get pointer to the mm_struct of the process, while processing page or segment faults from the capi card. However this causes issues when the thread that had originally issued the start-work ioctl exits in which case the stored pid is no more valid and the cxl driver is unable to handle faults as the mm_struct corresponding to process is no more accessible. This patch fixes this issue by using the mm_struct of the next alive task in the thread group. This is done by iterating over all the tasks in the thread group starting from thread group leader and calling get_task_mm on each one of them. When a valid mm_struct is obtained the pid of the associated task is stored in the context replacing the exiting one for handling future faults. The patch introduces a new function named get_mem_context that checks if the current task pointed to by ctx->pid is dead? If yes it performs the steps described above. Also a new variable cxl_context.glpid is introduced which stores the pid of the thread group leader associated with the context owning task. Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com> Suggested-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-24cxl: Fix possible idr warning when contexts are releasedVaibhav Jain
An idr warning is reported when a context is release after the capi card is unbound from the cxl driver via sysfs. Below are the steps to reproduce: 1. Create multiple afu contexts in an user-space application using libcxl. 2. Unbind capi card from cxl using command of form echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind 3. Exit/kill the application owning afu contexts. After above steps a warning message is usually seen in the kernel logs of the form "idr_remove called for id=<context-id> which is not allocated." This is caused by the function cxl_release_afu which destroys the contexts_idr table. So when a context is release no entry for context pe is found in the contexts_idr table and idr code prints this warning. This patch fixes this issue by increasing & decreasing the ref-count on the afu device when a context is initialized or when its freed respectively. This prevents the afu from being released until all the afu contexts have been released. The patch introduces two new functions namely cxl_afu_get/put that manage the ref-count on the afu device. Also the patch removes code inside cxl_dev_context_init that increases ref on the afu device as its guaranteed to be alive during this function. Reported-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01cxl: fix leak of IRQ names in cxl_free_afu_irqs()Andrew Donnellan
cxl_free_afu_irqs() doesn't free IRQ names when it releases an AFU's IRQ ranges. The userspace API equivalent in afu_release_irqs() calls afu_irq_name_free() to release the IRQ names. Call afu_irq_name_free() in cxl_free_afu_irqs() to release the IRQ names. Make afu_irq_name_free() non-static to allow this. Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30cxl: Set up and enable PSL TimebasePhilippe Bergheaud
This patch configures the PSL Timebase function and enables it, after the CAPP has been initialized by OPAL. Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30cxl: Fix force unmapping mmaps of contexts allocated through the kernel apiIan Munsie
The cxl user api uses the address_space associated with the file when we need to force unmap all cxl mmap regions (e.g. on eeh, driver detach, etc). Currently, contexts allocated through the kernel api do not do this and instead skip the mmap invalidation, potentially allowing them to poke at the hardware after such an event, which may cause all sorts of trouble. This patch allocates an address_space for cxl contexts allocated through the kernel api so that the same invalidate path will for these contexts as well. We don't use the anonymous inode's address_space, as doing so could invalidate any mmaps of completely unrelated drivers using anonymous file descriptors. This patch also introduces a kernelapi flag, so we know when freeing the context if the address_space was allocated by us and needs to be freed. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-18cxl: Add alternate MMIO error handlingIan Munsie
userspace programs using cxl currently have to use two strategies for dealing with MMIO errors simultaneously. They have to check every read for a return of all Fs in case the adapter has gone away and the kernel has not yet noticed, and they have to deal with SIGBUS in case the kernel has already noticed, invalidated the mapping and marked the context as failed. In order to simplify things, this patch adds an alternative approach where the kernel will return a page filled with Fs instead of delivering a SIGBUS. This allows userspace to only need to deal with one of these two error paths, and is intended for use in libraries that use cxl transparently and may not be able to safely install a signal handler. This approach will only work if certain constraints are met. Namely, if the application is both reading and writing to an address in the problem state area it cannot assume that a non-FF read is OK, as it may just be reading out a value it has previously written. Further - since only one page is used per context a write to a given offset would be visible when reading the same offset from a different page in the mapping (this only applies within a single context, not between contexts). An application could deal with this by e.g. making sure it also reads from a read-only offset after any reads to a read/write offset. Due to these constraints, this functionality must be explicitly requested by userspace when starting the context by passing in the CXL_START_WORK_ERR_FF flag. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14cxl: EEH supportDaniel Axtens
EEH (Enhanced Error Handling) allows a driver to recover from the temporary failure of an attached PCI card. Enable basic CXL support for EEH. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14cxl: Allow the kernel to trust that an image won't change on PERST.Daniel Axtens
Provide a kernel API and a sysfs entry which allow a user to specify that when a card is PERSTed, it's image will stay the same, allowing it to participate in EEH. cxl_reset is used to reflash the card. In that case, we cannot safely assert that the image will not change. Therefore, disallow cxl_reset if the flag is set. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14cxl: Allocate and release the SPA with the AFUDaniel Axtens
Previously the SPA was allocated and freed upon entering and leaving AFU-directed mode. This causes some issues for error recovery - contexts hold a pointer inside the SPA, and they may persist after the AFU has been detached. We would ideally like to allocate the SPA when the AFU is allocated, and release it until the AFU is released. However, we don't know how big the SPA needs to be until we read the AFU descriptor. Therefore, restructure the code: - Allocate the SPA only once, on the first attach. - Release the SPA only when the entire AFU is being released (not detached). Guard the release with a NULL check, so we don't free if it was never allocated (e.g. dedicated mode) Acked-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14cxl: Drop commands if the PCI channel is not in normal stateDaniel Axtens
If the PCI channel has gone down, don't attempt to poke the hardware. We need to guard every time cxl_whatever_(read|write) is called. This is because a call to those functions will dereference an offset into an mmio register, and the mmio mappings get invalidated in the EEH teardown. Check in the read/write functions in the header. We give them the same semantics as usual PCI operations: - a write to a channel that is down is ignored. - a read from a channel that is down returns all fs. Also, we try to access the MMIO space of a vPHB device as part of the PCI disable path. Because that's a read that bypasses most of our usual checks, we handle it explicitly. As far as user visible warnings go: - Check link state in file ops, return -EIO if down. - Be reasonably quiet if there's an error in a teardown path, or when we already know the hardware is going down. - Throw a big WARN if someone tries to start a CXL operation while the card is down. This gives a useful stacktrace for debugging whatever is doing that. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14cxl: Convert MMIO read/write macros to inline functionsDaniel Axtens
We're about to make these more complex, so make them functions first. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Add AFU virtual PHB and kernel APIMichael Neuling
This patch does two things. Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB). This in in addition to the PSL being a PCI device itself. As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can provide an AFU configuration. This AFU configuration recored is architected to be the same as a PCI config space. This patch sets discovers the AFU configuration records, provides AFU config space read/write functions to these configuration records. It then enumerates the PCI bus. It also hooks in PCI ops where appropriate. It also destroys the vPHB when the physical card is removed. Secondly, it add an in kernel API for AFU to use CXL. AFUs must present a driver that firstly binds as a PCI device. This PCI device can then be using to do CXL specific operations (that can't sit in the PCI ops) using this API. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Export file ops for use by APIMichael Neuling
The cxl kernel API will allow drivers other than cxl to export a file descriptor which has the same userspace API. These file descriptors will be able to be used against libcxl. This exports those file ops for use by other drivers. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Move include file cxl.h -> cxl-base.hMichael Neuling
This moves the current include file from cxl.h -> cxl-base.h. This current include file is used only to pass information between the base driver that needs to be built into the kernel and the cxl module. This is to make way for a new include/misc/cxl.h which will contain just the kernel API for other driver to use Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Split afu_register_irqs() functionMichael Neuling
Split the afu_register_irqs() function so that different parts can be useful elsewhere. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Export some symbolsMichael Neuling
Export some symbols which will soon be used elsewhere in this driver. Now they are global we rename them so to avoid collisions. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: cxl_afu_reset() -> __cxl_afu_reset()Michael Neuling
Rename cxl_afu_reset() to __cxl_afu_reset() to we can reuse this function name in the API. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Rework detach context functionsMichael Neuling
Rework __detach_context() and cxl_context_detach() so we can reuse them in the kernel API. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Add cookie parameter to afu_release_irqs()Michael Neuling
Add cookie parameter to afu_release_irqs() so that we can pass in a different cookie than the context structure. This will be useful for other kernel drivers that want to call this but get their own cookie back in the interrupt handler. Update all existing call sites. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Use call_rcu to reduce latency when releasing the afu fdIan Munsie
The afu fd release path was identified as a significant bottleneck in the overall performance of cxl. While an optimal AFU design would minimise the need to close & reopen the AFU fd, it is not always practical to avoid. The bottleneck seems to be down to the call to synchronize_rcu(), which will block until every other thread is guaranteed to be out of an RCU critical section. Replace it with call_rcu() to free the context structures later so we can return to the application sooner. This reduces the time spent in the fd release path from 13356 usec to 13.3 usec - about a 100x speed up. Reported-by: Fei K Chen <uchen@cn.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Export AFU error buffer via sysfsVaibhav Jain
Export the "AFU Error Buffer" via sysfs attribute (afu_err_buf). AFU error buffer is used by the AFU to report application specific errors. The contents of this buffer are AFU specific and are intended to be interpreted by the application interacting with the afu. Suggested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Implement an ioctl to fetch afu card-id, offset-id and modeVaibhav Jain
Given a file descriptor on an afu device, libcxl currently uses the major/minor number obtained from fstat on the fd to construct path to the afu's sysfs directory. However it is possible that rather than using one of the device in /dev/cxl, a kernel driver creates its own device which export generic cxl interface to the userspace. This causes problems with libcxl as it tries to use a wrong major/minor number to construct the sysfs path and fail. So this patch introduces a new ioctl called CXL_IOCTL_GET_AFU_ID on the afu file descriptor to fetch the cxl_afu_id struct that holds the card/offset-id and mode information. These info is then used by libcxl to construct the correct path to the afu sysfs directory. Testing: - Build against pseries be/le configs - Testing with corresponding libcxl changes to verify that it constructs right sysfs path to the afu. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-06cxl: Export optional AFU configuration record in sysfsIan Munsie
An AFU may optionally contain one or more PCIe like configuration records, which can be used to identify the AFU. This patch adds support for exposing the raw config space and the vendor, device and class code under sysfs. These will appear in a subdirectory of the AFU device corresponding with the configuration record number, e.g. cat /sys/class/cxl/afu0.0/cr0/vendor 0x1014 cat /sys/class/cxl/afu0.0/cr0/device 0x4350 cat /sys/class/cxl/afu0.0/cr0/class 0x120000 hexdump -C /sys/class/cxl/afu0.0/cr0/config 00000000 14 10 50 43 00 00 00 00 06 00 00 12 00 00 00 00 |..PC............| 00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000100 These files behave in much the same way as the equivalent files for PCI devices, with one exception being that the config file is currently read-only and restricted to the root user. It is not necessarily required to be this strict, but we currently do not have a compelling use-case to make it writable and/or world-readable, so I erred on the side of being restrictive. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22cxl: Add ability to reset the cardRyan Grimm
Adds reset to sysfs which will PERST the card. If load_image_on_perst is set to "user" or "factory", the PERST will cause that image to be loaded. load_image_on_perst is set to "user" for production. "none" could be used for debugging. The PSL trace arrays are preserved which then can be read through debugfs. PERST also triggers CAPP recovery. An HMI comes in, which is handled by EEH. EEH unbinds the driver, calls into Sapphire to reinitialize the PHB, then rebinds the driver. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22cxl: Use image state defaults for reloading FPGARyan Grimm
Select defaults such that a PERST causes flash image reload. Select which image based on what the card is set up to load. CXL_VSEC_PERST_LOADS_IMAGE selects whether PERST assertion causes flash image load. CXL_VSEC_PERST_SELECT_USER selects which image is loaded on the next PERST. cxl_update_image_control writes these bits into the VSEC. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29cxl: Disable AFU debug flagIan Munsie
Upon inspection of the implementation specific registers, it was discovered that the high bit of the implementation specific RXCTL register was enabled, which enables the DEADB00F debug feature. The debug feature causes MMIO reads to a disabled AFU to respond with 0xDEADB00F instead of all Fs. In general this should not be visible as the kernel will only allow MMIO access to enabled AFUs, but there may be some circumstances where an AFU may become disabled while it is use. One such case would be an AFU designed to only be used in the dedicated process mode and to disable itself after it has completed it's work (however even in that case the effects of this debug flag would be limited as the userspace application must have completed any required MMIO accesses before the AFU disables itself with or without the flag). This patch removes the debug flag and replaces the magic value programmed into this register with a preprocessor define so it is clearer what the rest of this initialisation does. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12cxl: Unmap MMIO regions when detaching a contextIan Munsie
If we need to force detach a context (e.g. due to EEH or simply force unbinding the driver) we should prevent the userspace contexts from being able to access the Problem State Area MMIO region further, which they may have mapped with mmap(). This patch unmaps any mapped MMIO regions when detaching a userspace context. Cc: stable@vger.kernel.org Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12cxl: Change contexts_lock to a mutex to fix sleep while atomic bugIan Munsie
We had a known sleep while atomic bug if a CXL device was forcefully unbound while it was in use. This could occur as a result of EEH, or manually induced with something like this while the device was in use: echo 0000:01:00.0 > /sys/bus/pci/drivers/cxl-pci/unbind The issue was that in this code path we iterated over each context and forcefully detached it with the contexts_lock spin lock held, however the detach also needed to take the spu_mutex, and call schedule. This patch changes the contexts_lock to a mutex so that we are not in atomic context while doing the detach, thereby avoiding the sleep while atomic. Also delete the related TODO comment, which suggested an alternate solution which turned out to not be workable. Cc: stable@vger.kernel.org Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-18cxl: Name interrupts in /proc/interruptMichael Neuling
Currently all interrupts generated by cxl are named "cxl". This is not very informative as we can't distinguish between cards, AFUs, error interrupts, user contexts and user interrupts numbers. Being able to distinguish them is useful for setting affinity. This patch gives each of these names in /proc/interrupts. A two card CAPI system, with afu0.0 having 2 active contexts each with 4 user IRQs each, will now look like this: % grep cxl /proc/interrupts 444: 0 OPAL ICS 141312 Level cxl-card1-err 445: 0 OPAL ICS 141313 Level cxl-afu1.0-err 446: 0 OPAL ICS 141314 Level cxl-afu1.0 462: 0 OPAL ICS 2052 Level cxl-afu0.0-pe0-1 463: 75517 OPAL ICS 2053 Level cxl-afu0.0-pe0-2 468: 0 OPAL ICS 2054 Level cxl-afu0.0-pe0-3 469: 0 OPAL ICS 2055 Level cxl-afu0.0-pe0-4 470: 0 OPAL ICS 2056 Level cxl-afu0.0-pe1-1 471: 75506 OPAL ICS 2057 Level cxl-afu0.0-pe1-2 472: 0 OPAL ICS 2058 Level cxl-afu0.0-pe1-3 473: 0 OPAL ICS 2059 Level cxl-afu0.0-pe1-4 502: 1066 OPAL ICS 2050 Level cxl-afu0.0 514: 0 OPAL ICS 2048 Level cxl-card0-err 515: 0 OPAL ICS 2049 Level cxl-afu0.0-err Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-18cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warningIan Munsie
If an AFU has a hardware bug that causes it to acknowledge a context terminate or remove while that context has outstanding transactions, it is possible for the kernel to receive an interrupt for that context after we have removed it from the context list. The kernel will not be able to demultiplex the interrupt (or worse - if we have already reallocated the process handle we could mis-attribute it to the new context), and printed a big scary warning. It did not acknowledge the interrupt, which would effectively halt further translation fault processing on the PSL. This patch makes the warning clearer about the likely cause of the issue (i.e. hardware bug) to make it obvious to future AFU designers of what needs to be fixed. It also prints out the process handle which can then be matched up with hardware and software traces for debugging. It also acknowledges the interrupt to the PSL with either an address error or acknowledge, so that the PSL can continue with other translations. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>