summaryrefslogtreecommitdiff
path: root/drivers/s390/net/ism_drv.c
AgeCommit message (Collapse)Author
6 dayss390/ism: fix concurrency management in ism_cmd()Halil Pasic
The s390x ISM device data sheet clearly states that only one request-response sequence is allowable per ISM function at any point in time. Unfortunately as of today the s390/ism driver in Linux does not honor that requirement. This patch aims to rectify that. This problem was discovered based on Aliaksei's bug report which states that for certain workloads the ISM functions end up entering error state (with PEC 2 as seen from the logs) after a while and as a consequence connections handled by the respective function break, and for future connection requests the ISM device is not considered -- given it is in a dysfunctional state. During further debugging PEC 3A was observed as well. A kernel message like [ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a is a reliable indicator of the stated function entering error state with PEC 2. Let me also point out that a kernel message like [ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery is a reliable indicator that the ISM function won't be auto-recovered because the ISM driver currently lacks support for it. On a technical level, without this synchronization, commands (inputs to the FW) may be partially or fully overwritten (corrupted) by another CPU trying to issue commands on the same function. There is hard evidence that this can lead to DMB token values being used as DMB IOVAs, leading to PEC 2 PCI events indicating invalid DMA. But this is only one of the failure modes imaginable. In theory even completely losing one command and executing another one twice and then trying to interpret the outputs as if the command we intended to execute was actually executed and not the other one is also possible. Frankly, I don't feel confident about providing an exhaustive list of possible consequences. Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory") Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-21s390: ism: Pass string literal as format argument of dev_set_name()Simon Horman
GCC 14.2.0 reports that passing a non-string literal as the format argument of dev_set_name() is potentially insecure. drivers/s390/net/ism_drv.c: In function 'ism_probe': drivers/s390/net/ism_drv.c:615:2: warning: format not a string literal and no format arguments [-Wformat-security] 615 | dev_set_name(&ism->dev, dev_name(&pdev->dev)); | ^~~~~~~~~~~~ It seems to me that as pdev is a PCIE device then the dev_name call above should always return the device's BDF, e.g. 00:12.0. That this should not contain format escape sequences. And thus the current usage is safe. But, it seems better to be safe than sorry. And, as a bonus, compiler output becomes less verbose by addressing this issue. Compile tested only. No functional change intended. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250417-ism-str-fmt-v1-1-9818b029874d@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-28Merge tag 'pci-v6.15-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Enable Configuration RRS SV, which makes device readiness visible, early instead of during child bus scanning (Bjorn Helgaas) - Log debug messages about reset methods being used (Bjorn Helgaas) - Avoid reset when it has been disabled via sysfs (Nishanth Aravamudan) - Add common pci-ep-bus.yaml schema for exporting several peripherals of a single PCI function via devicetree (Andrea della Porta) - Create DT nodes for PCI host bridges to enable loading device tree overlays to create platform devices for PCI devices that have several features that require multiple drivers (Herve Codina) Resource management: - Enlarge devres table[] to accommodate bridge windows, ROM, IOV BARs, etc., and validate BAR index in devres interfaces (Philipp Stanner) - Fix typo that repeatedly distributed resources to a bridge instead of iterating over subordinate bridges, which resulted in too little space to assign some BARs (Kai-Heng Feng) - Relax bridge window tail sizing for optional resources, e.g., IOV BARs, to avoid failures when removing and re-adding devices (Ilpo Järvinen) - Allow drivers to enable devices even if we haven't assigned optional IOV resources to them (Ilpo Järvinen) - Rework handling of optional resources (IOV BARs, ROMs) to reduce failures if we can't allocate them (Ilpo Järvinen) - Fix a NULL dereference in the SR-IOV VF creation error path (Shay Drory) - Fix s390 mmio_read/write syscalls, which didn't cause page faults in some cases, which broke vfio-pci lazy mapping on first access (Niklas Schnelle) - Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which was disabled only for s390 (Niklas Schnelle) - Support mmap of PCI resources on s390 except for ISM devices (Niklas Schnelle) ASPM: - Delay pcie_link_state deallocation to avoid dangling pointers that cause invalid references during hot-unplug (Daniel Stodden) Power management: - Allow PCI bridges to go to D3Hot when suspending on all non-x86 systems (Manivannan Sadhasivam) Power control: - Create pwrctrl devices in pci_scan_device() to make it more symmetric with pci_pwrctrl_unregister() and make pwrctrl devices for PCI bridges possible (Manivannan Sadhasivam) - Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc. can still access devices after pci_stop_dev() (Manivannan Sadhasivam) - If there's a pwrctrl device for a PCI device, skip scanning it because the pwrctrl core will rescan the bus after the device is powered on (Manivannan Sadhasivam) - Add a pwrctrl driver for PCI slots based on voltage regulators described via devicetree (Manivannan Sadhasivam) Bandwidth control: - Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the set_pcie_cooling_state.sh test case (Yi Lai) - Avoid a NULL pointer dereference when we run out of bus numbers to assign for a bridge secondary bus (Lukas Wunner) Hotplug: - Drop superfluous pci_hotplug_slot_list, try_module_get() calls, and NULL pointer checks (Lukas Wunner) - Drop shpchp module init/exit logging, replace shpchp dbg() with ctrl_dbg(), and remove unused dbg(), err(), info(), warn() wrappers (Ilpo Järvinen) - Drop 'shpchp_debug' module parameter in favor of standard dynamic debugging (Ilpo Järvinen) - Drop unused cpcihp .get_power(), .set_power() function pointers (Guilherme Giacomo Simoes) - Disable hotplug interrupts in portdrv only when pciehp is not enabled to avoid issuing two hotplug commands too close together (Feng Tang) - Skip pciehp 'device replaced' check if the device has been removed to address a deadlock when resuming after a device was removed during system sleep (Lukas Wunner) - Don't enable pciehp hotplug interupt when resuming in poll mode (Ilpo Järvinen) Virtualization: - Fix bugs in 'pci=config_acs=' kernel command line parameter (Tushar Dave) DOE: - Expose supported DOE features via sysfs (Alistair Francis) - Allow DOE support to be enabled even if CXL isn't enabled (Alistair Francis) Endpoint framework: - Convert PCI device data so pci-epf-test works correctly on big-endian endpoint systems (Niklas Cassel) - Add BAR_RESIZABLE type to endpoint framework and add DWC core support for EPF drivers to set BAR_RESIZABLE type and size (Niklas Cassel) - Fix pci-epf-test double free that causes an oops if the host reboots and PERST# deassertion restarts endpoint BAR allocation (Christian Bruel) - Fix endpoint BAR testing so tests can skip disabled BARs instead of reporting them as failures (Niklas Cassel) - Widen endpoint test BAR size variable to accommodate BARs larger than INT_MAX (Niklas Cassel) - Remove unused tools 'pci' build target left over after moving tests to tools/testing/selftests/pci_endpoint (Jianfeng Liu) Altera PCIe controller driver: - Add DT binding and driver support for Agilex family (P-Tile, F-Tile, R-Tile) (Matthew Gerlach and D M, Sharath Kumar) AMD MDB PCIe controller driver: - Add DT binding and driver for AMD MDB (Multimedia DMA Bridge) (Thippeswamy Havalige) Broadcom STB PCIe controller driver: - Add BCM2712 MSI-X DT binding and interrupt controller drivers and add softdep on irq_bcm2712_mip driver to ensure that it is loaded first (Stanimir Varbanov) - Expand inbound window map to 64GB so it can accommodate BCM2712 (Stanimir Varbanov) - Add BCM2712 support and DT updates (Stanimir Varbanov) - Apply link speed restriction before bringing link up, not after (Jim Quinlan) - Update Max Link Speed in Link Capabilities via the internal writable register, not the read-only config register (Jim Quinlan) - Handle regulator_bulk_get() error to avoid panic when we call regulator_bulk_free() later (Jim Quinlan) - Disable regulators only when removing the bus immediately below a Root Port because we don't support regulators deeper in the hierarchy (Jim Quinlan) - Make const read-only arrays static (Colin Ian King) Cadence PCIe endpoint driver: - Correct MSG TLP generation so endpoints can generate INTx messages (Hans Zhang) Freescale i.MX6 PCIe controller driver: - Identify the second controller on i.MX8MQ based on devicetree 'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu) - Remove imx_pcie_cpu_addr_fixup() since dwc core can now derive the ATU input address (using parent_bus_offset) from devicetree (Frank Li) Freescale Layerscape PCIe controller driver: - Drop deprecated 'num-ib-windows' and 'num-ob-windows' and unnecessary 'status' from example (Krzysztof Kozlowski) - Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg") arg_count to fix probe failure on LS1043A (Ioana Ciornei) HiSilicon STB PCIe controller driver: - Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe JAILLET) Intel Gateway PCIe controller driver: - Remove intel_pcie_cpu_addr() since dwc core can now derive the ATU input address (using parent_bus_offset) from devicetree (Frank Li) Intel VMD host bridge driver: - Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so pci_ops.read() will never sleep, even on PREEMPT_RT where spinlock_t becomes a sleepable lock, to avoid calling a sleeping function from invalid context (Ryo Takakura) MediaTek PCIe Gen3 controller driver: - Remove leftover mac_reset assert for Airoha EN7581 SoC (Lorenzo Bianconi) - Add EN7581 PBUS controller 'mediatek,pbus-csr' DT property and program host bridge memory aperture to this syscon node (Lorenzo Bianconi) Qualcomm PCIe controller driver: - Add qcom,pcie-ipq5332 binding (Varadarajan Narayanan) - Add qcom i.MX8QM and i.MX8QXP/DXP optional DMA interrupt (Alexander Stein) - Add optional dma-coherent DT property for Qualcomm SA8775P (Dmitry Baryshkov) - Make DT iommu property required for SA8775P and prohibited for SDX55 (Dmitry Baryshkov) - Add DT IOMMU and DMA-related properties for Qualcomm SM8450 (Dmitry Baryshkov) - Add endpoint DT properties for SAR2130P and enable endpoint mode in driver (Dmitry Baryshkov) - Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as RESERVED (Manivannan Sadhasivam) Rockchip DesignWare PCIe controller driver: - Describe rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas Cassel) Synopsys DesignWare PCIe controller driver: - Add debugfs-based Silicon Debug, Error Injection, Statistical Counter support for DWC (Shradha Todi) - Add debugfs property to expose LTSSM status of DWC PCIe link (Hans Zhang) - Add Rockchip support for DWC debugfs features (Niklas Cassel) - Add dw_pcie_parent_bus_offset() to look up the parent bus address of a specified 'reg' property and return the offset from the CPU physical address (Frank Li) - Use dw_pcie_parent_bus_offset() to derive CPU -> ATU addr offset via 'reg[config]' for host controllers and 'reg[addr_space]' for endpoint controllers (Frank Li) - Apply struct dw_pcie.parent_bus_offset in ATU users to remove use of .cpu_addr_fixup() when programming ATU (Frank Li) TI J721E PCIe driver: - Correct the 'link down' interrupt bit for J784S4 (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Describe AM65x BARs 2 and 5 as Resizable (not Fixed) and reduce alignment requirement from 1MB to 64KB (Niklas Cassel) Xilinx Versal CPM PCIe controller driver: - Free IRQ domain in probe error path to avoid leaking it (Thippeswamy Havalige) - Add DT .compatible "xlnx,versal-cpm5nc-host" and driver support for Versal Net CPM5NC Root Port controller (Thippeswamy Havalige) - Add driver support for CPM5_HOST1 (Thippeswamy Havalige) Miscellaneous: - Convert fsl,mpc83xx-pcie binding to YAML (J. Neuschäfer) - Use for_each_available_child_of_node_scoped() to simplify apple, kirin, mediatek, mt7621, tegra drivers (Zhang Zekun)" * tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (197 commits) PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args() PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4 misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts PCI: endpoint: Add intx_capable to epc_features struct dt-bindings: PCI: Add common schema for devices accessible through PCI BARs PCI: intel-gw: Remove intel_pcie_cpu_addr() PCI: imx6: Remove imx_pcie_cpu_addr_fixup() PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup() PCI: dwc: ep: Ensure proper iteration over outbound map windows PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources() PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init() PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug PCI: dwc: Add dw_pcie_parent_bus_offset() PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant PCI: brcmstb: Make const read-only arrays static ...
2025-03-21s390/pci: Support mmap() of PCI resources except for ISM devicesNiklas Schnelle
So far s390 does not allow mmap() of PCI resources to user-space via the usual mechanisms, though it does use it for RDMA. For the PCI sysfs resource files and /proc/bus/pci it defines neither HAVE_PCI_MMAP nor ARCH_GENERIC_PCI_MMAP_RESOURCE. For vfio-pci s390 previously relied on disabled VFIO_PCI_MMAP and now relies on setting pdev->non_mappable_bars for all devices. This is partly because access to mapped PCI resources from user-space requires special PCI load/store memory-I/O (MIO) instructions, or the special MMIO syscalls when these are not available. Still, such access is possible and useful not just for RDMA, in fact not being able to mmap() PCI resources has previously caused extra work when testing devices. One thing that doesn't work with PCI resources mapped to user-space though is the s390 specific virtual ISM device. Not only because the BAR size of 256 TiB prevents mapping the whole BAR but also because access requires use of the legacy PCI instructions which are not accessible to user-space on systems with the newer MIO PCI instructions. Now with the pdev->non_mappable_bars flag ISM can be excluded from mapping its resources while making this functionality available for all other PCI devices. To this end introduce a minimal implementation of PCI_QUIRKS and use that to set pdev->non_mappable_bars for ISM devices only. Then also set ARCH_GENERIC_PCI_MMAP_RESOURCE to take advantage of the generic implementation of pci_mmap_resource_range() enabling only the newer sysfs mmap() interface. This follows the recommendation in Documentation/PCI/sysfs-pci.rst. Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-3-c5c0f1d26efd@linux.ibm.com Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-17s390/ism: add release function for struct deviceJulian Ruess
According to device_release() in /drivers/base/core.c, a device without a release function is a broken device and must be fixed. The current code directly frees the device after calling device_add() without waiting for other kernel parts to release their references. Thus, a reference could still be held to a struct device, e.g., by sysfs, leading to potential use-after-free issues if a proper release function is not set. Fixes: 8c81ba20349d ("net/smc: De-tangle ism and smc device initialization") Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Julian Ruess <julianr@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250214120137.563409-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-30net/smc: decouple ism_client from SMC-D DMB registrationWen Gu
The struct 'ism_client' is specialized for s390 platform firmware ISM. So replace it with 'void' to make SMCD DMB registration helper generic for both Emulated-ISM and existing ISM. Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-and-tested-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-17s390/ism: Properly fix receive message buffer allocationGerd Bayer
Since [1], dma_alloc_coherent() does not accept requests for GFP_COMP anymore, even on archs that may be able to fulfill this. Functionality that relied on the receive buffer being a compound page broke at that point: The SMC-D protocol, that utilizes the ism device driver, passes receive buffers to the splice processor in a struct splice_pipe_desc with a single entry list of struct pages. As the buffer is no longer a compound page, the splice processor now rejects requests to handle more than a page worth of data. Replace dma_alloc_coherent() and allocate a buffer with folio_alloc and create a DMA map for it with dma_map_page(). Since only receive buffers on ISM devices use DMA, qualify the mapping as FROM_DEVICE. Since ISM devices are available on arch s390, only, and on that arch all DMA is coherent, there is no need to introduce and export some kind of dma_sync_to_cpu() method to be called by the SMC-D protocol layer. Analogously, replace dma_free_coherent by a two step dma_unmap_page, then folio_put to free the receive buffer. [1] https://lore.kernel.org/all/20221113163535.884299-1-hch@lst.de/ Fixes: c08004eede4b ("s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent") Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26net/smc: manage system EID in SMC stack instead of ISM driverWen Gu
The System EID (SEID) is an internal EID that is used by the SMCv2 software stack that has a predefined and constant value representing the s390 physical machine that the OS is executing on. So it should be managed by SMC stack instead of ISM driver and be consistent for all ISMv2 device (including virtual ISM devices) on s390 architecture. Suggested-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26net/smc: compatible with 128-bits extended GID of virtual ISM deviceWen Gu
According to virtual ISM support feature defined by SMCv2.1, GIDs of virtual ISM device are UUIDs defined by RFC4122, which are 128-bits long. So some adaptation work is required. And note that the GIDs of existing platform firmware ISM devices still remain 64-bits long. Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-17s390/ism: ism driver implies smc protocolGerd Bayer
Since commit a72178cfe855 ("net/smc: Fix dependency of SMC on ISM") you can build the ism code without selecting the SMC network protocol. That leaves some ism functions be reported as unused. Move these functions under the conditional compile with CONFIG_SMC. Also codify the suggestion to also configure the SMC protocol in ism's Kconfig - but with an "imply" rather than a "select" as SMC depends on other config options and allow for a deliberate decision not to build SMC. Also, mention that in ISM's help. Fixes: a72178cfe855 ("net/smc: Fix dependency of SMC on ISM") Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/netdev/afd142a2-1fa0-46b9-8b2d-7652d41d3ab8@infradead.org/ Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-08s390/ism: Do not unregister clients with registered DMBsNiklas Schnelle
When ism_unregister_client() is called but the client still has DMBs registered it returns -EBUSY and prints an error. This only happens after the client has already been unregistered however. This is unexpected as the unregister claims to have failed. Furthermore as this implies a client bug a WARN() is more appropriate. Thus move the deregistration after the check and use WARN(). Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-08s390/ism: Fix and simplify add()/remove() callback handlingNiklas Schnelle
Previously the clients_lock was protecting the clients array against concurrent addition/removal of clients but was also accessed from IRQ context. This meant that it had to be a spinlock and that the add() and remove() callbacks in which clients need to do allocation and take mutexes can't be called under the clients_lock. To work around this these callbacks were moved to workqueues. This not only introduced significant complexity but is also subtly broken in at least one way. In ism_dev_init() and ism_dev_exit() clients[i]->tgt_ism is used to communicate the added/removed ISM device to the work function. While write access to client[i]->tgt_ism is protected by the clients_lock and the code waits that there is no pending add/remove work before and after setting clients[i]->tgt_ism this is not enough. The problem is that the wait happens based on per ISM device counters. Thus a concurrent ism_dev_init()/ism_dev_exit() for a different ISM device may overwrite a clients[i]->tgt_ism between unlocking the clients_lock and the subsequent wait for the work to finnish. Thankfully with the clients_lock no longer held in IRQ context it can be turned into a mutex which can be held during the calls to add()/remove() completely removing the need for the workqueues and the associated broken housekeeping including the per ISM device counters and the clients[i]->tgt_ism. Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-08s390/ism: Fix locking for forwarding of IRQs and events to clientsNiklas Schnelle
The clients array references all registered clients and is protected by the clients_lock. Besides its use as general list of clients the clients array is accessed in ism_handle_irq() to forward ISM device events to clients. While the clients_lock is taken in the IRQ handler when calling handle_event() it is however incorrectly not held during the client->handle_irq() call and for the preceding clients[] access leaving it unprotected against concurrent client (un-)registration. Furthermore the accesses to ism->sba_client_arr[] in ism_register_dmb() and ism_unregister_dmb() are not protected by any lock. This is especially problematic as the client ID from the ism->sba_client_arr[] is not checked against NO_CLIENT and neither is the client pointer checked. Instead of expanding the use of the clients_lock further add a separate array in struct ism_dev which references clients subscribed to the device's events and IRQs. This array is protected by ism->lock which is already taken in ism_handle_irq() and can be taken outside the IRQ handler when adding/removing subscribers or the accessing ism->sba_client_arr[]. This also means that the clients_lock is no longer taken in IRQ context. Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: include/linux/mlx5/driver.h 617f5db1a626 ("RDMA/mlx5: Fix affinity assignment") dc13180824b7 ("net/mlx5: Enable devlink port for embedded cpu VF vports") https://lore.kernel.org/all/20230613125939.595e50b8@canb.auug.org.au/ tools/testing/selftests/net/mptcp/mptcp_join.sh 47867f0a7e83 ("selftests: mptcp: join: skip check if MIB counter not supported") 425ba803124b ("selftests: mptcp: join: support RM_ADDR for used endpoints or not") 45b1a1227a7a ("mptcp: introduces more address related mibs") 0639fa230a21 ("selftests: mptcp: add explicit check for new mibs") https://lore.kernel.org/netdev/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15s390/ism: Fix trying to free already-freed IRQ by repeated ism_dev_exit()Julian Ruess
This patch prevents the system from crashing when unloading the ISM module. How to reproduce: Attach an ISM device and execute 'rmmod ism'. Error-Log: - Trying to free already-free IRQ 0 - WARNING: CPU: 1 PID: 966 at kernel/irq/manage.c:1890 free_irq+0x140/0x540 After calling ism_dev_exit() for each ISM device in the exit routine, pci_unregister_driver() will execute ism_remove() for each ISM device. Because ism_remove() also calls ism_dev_exit(), free_irq(pci_irq_vector(pdev, 0), ism) is called twice for each ISM device. This results in a crash with the error 'Trying to free already-free IRQ'. In the exit routine, it is enough to call pci_unregister_driver() because it ensures that ism_dev_exit() is called once per ISM device. Cc: <stable@vger.kernel.org> # 6.3+ Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Julian Ruess <julianr@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-25s390/ism: Set DMA coherent maskNiklas Schnelle
A future change will convert the DMA API implementation from the architecture specific arch/s390/pci/pci_dma.c to using the common code drivers/iommu/dma-iommu.c which the utilizes the same IOMMU hardware through the s390-iommu driver. Unlike the s390 specific DMA API this requires devices to correctly set the coherent mask to be allowed to use IOVAs >2^32 in dma_alloc_coherent(). This was however not done for ISM devices. ISM requires such addresses since currently the DMA aperture for PCI devices starts at 2^32 and all calls to dma_alloc_coherent() would thus fail. Link: https://lore.kernel.org/all/20230310-dma_iommu-v9-1-65bb8edd2beb@linux.ibm.com/ Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Link: https://lore.kernel.org/r/20230524075411.3734141-1-schnelle@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-24net/ism: Remove redundant pci_clear_masterCai Huoqing
Remove pci_clear_master to simplify the code, the bus-mastering is also cleared in do_pci_disable_device, like this: ./drivers/pci/pci.c:2197 static void do_pci_disable_device(struct pci_dev *dev) { u16 pci_command; pci_read_config_word(dev, PCI_COMMAND, &pci_command); if (pci_command & PCI_COMMAND_MASTER) { pci_command &= ~PCI_COMMAND_MASTER; pci_write_config_word(dev, PCI_COMMAND, pci_command); } pcibios_disable_device(dev); }. And dev->is_busmaster is set to 0 in pci_disable_device. Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/ism: Remove extra includeStefan Raspl
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/smc: Introduce explicit check for v2 supportStefan Raspl
Previously, v2 support was derived from a very specific format of the SEID as part of the SMC-D codebase. Make this part of the SMC-D device API, so implementers do not need to adhere to a specific SEID format. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Reviewed-and-tested-by: Jan Karcher <jaka@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/smc: De-tangle ism and smc device initializationStefan Raspl
The struct device for ISM devices was part of struct smcd_dev. Move to struct ism_dev, provide a new API call in struct smcd_ops, and convert existing SMCD code accordingly. Furthermore, remove struct smcd_dev from struct ism_dev. This is the final part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25s390/ism: Consolidate SMC-D-related codeStefan Raspl
The ism module had SMC-D-specific code sprinkled across the entire module. We are now consolidating the SMC-D-specific parts into the latter parts of the module, so it becomes more clear what code is intended for use with ISM, and which parts are glue code for usage in the context of SMC-D. This is the fourth part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/smc: Separate SMC-D and ISM APIsStefan Raspl
We separate the code implementing the struct smcd_ops API in the ISM device driver from the functions that may be used by other exploiters of ISM devices. Note: We start out small, and don't offer the whole breadth of the ISM device for public use, as many functions are specific to or likely only ever used in the context of SMC-D. This is the third part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/smc: Register SMC-D as ISM clientStefan Raspl
Register the smc module with the new ism device driver API. This is the second part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/ism: Add new API for client registrationStefan Raspl
Add a new API that allows other drivers to concurrently access ISM devices. To do so, we introduce a new API that allows other modules to register for ISM device usage. Furthermore, we move the GID to struct ism, where it belongs conceptually, and rename and relocate struct smcd_event to struct ism_event. This is the first part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25s390/ism: Introduce struct ism_dmbStefan Raspl
Conceptually, a DMB is a structure that belongs to ISM devices. However, SMC currently 'owns' this structure. So future exploiters of ISM devices would be forced to include SMC headers to work - which is just weird. Therefore, we switch ISM to struct ism_dmb, introduce a new public header with the definition (will be populated with further API calls later on), and, add a thin wrapper to please SMC. Since structs smcd_dmb and ism_dmb are identical, we can simply convert between the two for now. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/ism: Add missing calls to disable bus-masteringStefan Raspl
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherentChristoph Hellwig
dma_alloc_coherent is an opaque allocator that only uses the GFP_ flags for allocation context control. Don't pass __GFP_COMP which makes no sense for an allocation that can't in any way be converted to a page pointer. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Wenjia Zhang <wenjia@linux.ibm.com>
2022-07-27net/smc: Pass on DMBE bit mask in IRQ handlerStefan Raspl
Make the DMBE bits, which are passed on individually in ism_move() as parameter idx, available to the receiver. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang < wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-27s390/ism: CleanupsStefan Raspl
Reworked signature of the function to retrieve the system EID: No plausible reason to use a double pointer. And neither to pass in the device as an argument, as this identifier is by definition per system, not per device. Plus some minor consistency edits. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang < wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14s390/ism: switch from 'pci_' to 'dma_' APIChristophe JAILLET
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-26s390/ism: fix incorrect system EIDKarsten Graul
The system EID that is defined by the ISM driver is not correct. Using an incorrect system EID allows to communicate with remote Linux systems that use the same incorrect system EID, but when it comes to interoperability with other operating systems then the system EIDs do never match which prevents SMC-Dv2 communication. Using the correct system EID fixes this problem. Fixes: 201091ebb2a1 ("net/smc: introduce System Enterprise ID (SEID)") Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-28net/smc: introduce CHID callback for ISM devicesUrsula Braun
With SMCD version 2 the CHIDs of ISM devices are needed for the CLC handshake. This patch provides the new callback to retrieve the CHID of an ISM device. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: introduce System Enterprise ID (SEID)Ursula Braun
SMCD version 2 defines a System Enterprise ID (short SEID). This patch contains the SEID creation and adds the callback to retrieve the created SEID. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27s390/ism: indicate correct error reason in ism_alloc_dmb()Karsten Graul
When the ism driver allocates a new dmb in ism_alloc_dmb() it must first check for and reserve a slot in the sba bitmap. When find_next_zero_bit() finds no free slot then the return code is -ENOMEM. This code conflicts with the error when the alloc() fails later in the code. As a result of that the caller can not differentiate between out-of-memory conditions and sba-bitmap-full conditions. Fix that by using the return code -ENOSPC when the sba slot reservation failed. Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-13s390/ism: fix error return code in ism_probe()Wei Yongjun
Fix to return negative error code -ENOMEM from the smcd_alloc_dev() error handling case instead of 0, as done elsewhere in this function. Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-27s390/ism: remove pm supportUrsula Braun
As s390 no longer supports ARCH_HIBERNATION_POSSIBLE, drop the unused pm ops from the ism driver. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-04-29s390/ism: move oddities of device IO to wrapper functionSebastian Ott
ISM devices are special in how they access PCI memory space. Provide wrappers for handling commands to the device. No functional change. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-20s390/ism: ignore some errors during deregistrationSebastian Ott
Prior to dma unmap/free operations the ism driver tries to ensure that the memory is no longer accessed by the HW. When errors during deregistration of memory regions from the HW occur the ism driver will not unmap/free this memory. When we receive notification from the hypervisor that a PCI function has been detached we can no longer access the device and would never unmap/free these memory regions which led to complaints by the DMA debug API. Treat this kind of errors during the deregistration of memory regions from the HW as success since it is already ensured that the memory is no longer accessed by HW. Reported-by: Karsten Graul <kgraul@linux.ibm.com> Reported-by: Hans Wippel <hwippel@linux.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-08cross-tree: phase out dma_zalloc_coherent()Luis Chamberlain
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-11-14s390/ism: clear dmbe_mask bit before SMC IRQ handlingUrsula Braun
SMC-D stress workload showed connection stalls. Since the firmware decides to skip raising an interrupt if the SBA DMBE mask bit is still set, this SBA DMBE mask bit should be cleared before the IRQ handling in the SMC code runs. Otherwise there are small windows possible with missing interrupts for incoming data. SMC-D currently does not care about the old value of the SBA DMBE mask. Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10PCI: Remove pci_set_dma_max_seg_size()Christoph Hellwig
The few callers can just use dma_set_max_seg_size ()directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-10PCI: Remove pci_set_dma_seg_boundary()Christoph Hellwig
The two callers can just use dma_set_seg_boundary() directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-30s390/ism: add device driver for internal shared memorySebastian Ott
Add support for the Internal Shared Memory vPCI Adapter. This driver implements the interfaces of the SMC-D protocol. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>