summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2023-03-09Merge patch series "PCI/AER: Remove redundant Device Control Error Reporting ↵Martin K. Petersen
Enable" Bjorn Helgaas <helgaas@kernel.org> says: Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), which appeared in v6.0, the PCI core has enabled PCIe error reporting for all devices during enumeration. Remove driver code to do this and remove unnecessary includes of <linux/aer.h> from several other drivers. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: qla4xxx: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-11-helgaas@kernel.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Manish Rangankar <mrangankar@marvell.com> Cc: GR-QLogic-Storage-Upstream@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: qla2xxx: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-10-helgaas@kernel.org Cc: Nilesh Javali <njavali@marvell.com> Cc: GR-QLogic-Storage-Upstream@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: mpt3sas: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-9-helgaas@kernel.org Cc: Sathya Prakash <sathya.prakash@broadcom.com> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Cc: MPT-FusionLinux.pdl@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-8-helgaas@kernel.org Cc: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: hpsa: Remove unnecessary pci_disable_pcie_error_reporting() commentBjorn Helgaas
Commit 105a3dbc7452 ("hpsa: clean up driver init") added a comment about pci_disable_pcie_error_reporting(), but hpsa has never called either pci_enable_pcie_error_reporting() or pci_disable_pcie_error_reporting(). Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core is responsible for managing PCIe device error reporting. Remove the comment about pci_disable_pcie_error_reporting(). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-7-helgaas@kernel.org Cc: Don Brace <don.brace@microchip.com> Cc: storagedev@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: csiostor: Remove unnecessary aer.h includeBjorn Helgaas
<linux/aer.h> is unused, so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-6-helgaas@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: bfa: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-5-helgaas@kernel.org Cc: Anil Gurumurthy <anil.gurumurthy@qlogic.com> Cc: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: be2iscsi: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-4-helgaas@kernel.org Cc: Ketan Mukadam <ketan.mukadam@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: arcmsr: Remove unnecessary aer.h includeBjorn Helgaas
<linux/aer.h> is unused, so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-3-helgaas@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: aacraid: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Also remove pci_disable_pcie_error_reporting() from the .error_detected() path, which was added by commit 5c63f7f710bd ("aacraid: Added EEH support") but looks unnecessary. Error reporting will be disabled by the device reset and will be re-enabled by the pci_restore_state() in aac_pci_slot_reset(). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230307182842.870378-2-helgaas@kernel.org Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com> Cc: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com> Cc: Tomas Henzl <thenzl@redhat.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09Merge patch series "Add poll support for hisi_sas v3 hw"Martin K. Petersen
chenxiang <chenxiang66@hisilicon.com> says: To support IO_URING IOPOLL support for hisi_sas, we need to: - Add and fill mq_poll interface to poll queue; - Ensure internal I/Os (including internal abort I/Os) are delivered and completed through non-iopoll queue (queue 0); Sending internal abort commands to non-poll queue actually requires to sending the abort command to every queue. This carries a a risk. Make iopoll support module parameter "experimental". I have tested performance on v3 hw with different modes as follows. 4K READs and 4K WRITEs both see an improvement when enabling poll mode: 4K READ 4K RANDREAD 4K WRITE 4K RANDWRITE interrupt + libaio 1770k 1316k 1197k 831k interrupt + io_uring 1848k 1390k 1238k 857k iopoll + io_uring 2117k 1364k 1874k 849k Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: hisi_sas: Add device attribute experimental_iopoll_q_cnt for v3 hwXiang Chen
Add device attribute experimental_iopoll_q_cnt to indicate how many iopoll queues are used for v3 hw. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1678169355-76215-5-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: hisi_sas: Sync complete queue for poll queueXiang Chen
Currently we sync irq to avoid freeing task before using task in I/O completion. After adding io_uring support, we need to do something similar for poll queues. As the process of CQ entries on poll queue are protected by spinlock cq->lock, we can use spin_lock() + spin_unlock() on cq->lock to make sure that CQ entries are processed to completion and then the complete queue is synced. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1678169355-76215-4-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: hisi_sas: Add poll support for v3 hwXiang Chen
Add a module parameter to set how many queues are used for iopoll. Also fill the interface mq_poll. For internal I/Os from libsas and libata we use non-iopoll queue (queue 0) to deliver and complete them. But for internal abort I/Os, just don't send them for poll queues. There is still a risk associated as this sends internal abort commands to non-iopoll queues which actually requires sending an internal abort command to every queue. As a result, make the module parameter as "experimental" for now. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1678169355-76215-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: hisi_sas: Add function complete_v3_hw()Xiang Chen
Put the work of processing cq slots in a separate function, complete_v3_hw(), which can then be used by cq_thread_v3_hw() and other functions when adding poll support. Co-developed-by: John Garry <john.garry@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1678169355-76215-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09Merge patch series "lpfc: Update lpfc to revision 14.2.0.11"Martin K. Petersen
Justin Tee <justin.tee@broadcom.com> says: Update lpfc to revision 14.2.0.11 This patch set contains bug fixes for buffer overflow, resource management, discovery, and HBA error status handling. The patches were cut against Martin's 6.3/scsi-queue tree. Justin Tee (10): lpfc: Protect against potential lpfc_debugfs_lockstat_write buffer overflow lpfc: Reorder freeing of various dma buffers and their list removal lpfc: Fix lockdep warning for rx_monitor lock when unloading driver lpfc: Record LOGO state with discovery engine even if aborted lpfc: Defer issuing new PLOGI if received RSCN before completing REG_LOGIN lpfc: Correct used_rpi count when devloss tmo fires with no recovery lpfc: Skip waiting for register ready bits when in unrecoverable state lpfc: Revise lpfc_error_lost_link reason code evaluation logic lpfc: Update lpfc version to 14.2.0.11 lpfc: Copyright updates for 14.2.0.11 patches Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Copyright updates for 14.2.0.11 patchesJustin Tee
Update copyrights to 2023 for files modified in the 14.2.0.11 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-11-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Update lpfc version to 14.2.0.11Justin Tee
Update lpfc version to 14.2.0.11. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Revise lpfc_error_lost_link() reason code evaluation logicJustin Tee
Extended status reason code errors should mask off the IOERR_PARAM_MASK before checking strict equalities for IOERR values. Update the lpfc_error_lost_link() routine as such. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Skip waiting for register ready bits when in unrecoverable stateJustin Tee
During tolerance tests that force an HBA to become unresponsive, rmmod hangs resulting in the inability to remove the driver. The lpfc_pci_remove_one_s4() routine attempts to submit a clean up mailbox command via the lpfc_sli4_post_sync_mbox() routine, but ends up waiting forever for a mailbox register to set its ready bit. Because the HBA is in an unrecoverable and unresponsive state, the ready bit will never be set. Create a new routine called lpfc_sli4_unrecoverable_port(), which checks a port status register's error notification bits. Use the lpfc_sli4_unrecoverable_port() routine in ready bit check routines to early return error if port is deemed unrecoverable. Also, when the lpfc_handle_eratt_s4() handler detects an unrecoverable state, call the lpfc_sli4_offline_eratt() routine to kick off flushing outstanding I/O. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recoveryJustin Tee
A fabric controller can sometimes send an RDP request right before a link down event. Because of this outstanding RDP request, the driver does not remove the last reference count on its ndlp causing a potential leak of RPI resources when devloss tmo fires. In lpfc_cmpl_els_rsp(), modify the NPIV clause to always allow the lpfc_drop_node() routine to execute when not registered with SCSI transport. This relaxes the contraint that an NPIV ndlp must be in a specific state in order to call lpfc_drop node. Logic is revised such that the lpfc_drop_node() routine is always called to ensure the last ndlp decrement occurs. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Defer issuing new PLOGI if received RSCN before completing REG_LOGINJustin Tee
When mapped to a target with multiple virtual ports, a link bounce sometimes results in unsuccessful rediscovery of all of the target's virtual ports. This is because a succession of repeat RSCNs for the virtual target ports leaves ndlps in the REG_LOGIN state with the NLP_REG_LOGIN_SEND flag set. With NLP_REG_LOGIN_SEND set, during the next PLOGI, the driver will UNREG_RPI. When UNREG_RPI is processed, the driver can be in the middle of PRLI_ISSUE or MAPPED state resulting in an illegal state transition by the discovery engine and stalling. Fix by calling the discovery state machine with DEVICE_RECOVERY event during RSCN processing. This will set the NLP_IGNR_REG_CMPL bit and prevent the old REG_LOGIN state from advancing. Then for the new PLOGI issue, add the check for the NLP_IGNR_REG_CMPL bit to delay issuing the new PLOGI until the queued REG_LOGIN and UNREG_LOGIN have been processed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Record LOGO state with discovery engine even if abortedJustin Tee
A target vendor array reboot in P2P topology can sometimes result in unsuccessful rediscovery. Rework the lpfc_cmpl_els_logo() routine such that when the LOGO completes as a failure because of driver abort, the LOGO state is still recorded with the discovery state machine. This is a small rework to set LOGO completion without forcing a device removal state change. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Fix lockdep warning for rx_monitor lock when unloading driverJustin Tee
Lockdep enabled kernels report a theoretical deadlock state where the cmf_timer interrupt occurs while the rx_monitor ring is being destroyed. During rmmod, the cmf_timer is cancelled prior to the lpfc_rx_monitor_destroy_ring call. This actually eliminates the need to take the rx_monitor ring lock in lpfc_rx_monitor_destroy_ring. Thus, just remove lock/unlock of rx_monitor in lpfc_rx_monitor_destroy_ring. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Reorder freeing of various DMA buffers and their list removalJustin Tee
Code sections where DMA resources are freed before list removal are reworked to ensure item removal before being freed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: lpfc: Prevent lpfc_debugfs_lockstat_write() buffer overflowJustin Tee
A static code analysis tool flagged the possibility of buffer overflow when using copy_from_user() for a debugfs entry. Currently, it is possible that copy_from_user() copies more bytes than what would fit in the mybuf char array. Add a min() restriction check between sizeof(mybuf) - 1 and nbytes passed from the userspace buffer to protect against buffer overflow. Link: https://lore.kernel.org/r/20230301231626.9621-2-justintee8345@gmail.com Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: be2iscsi: Remove unused variable internal_page_offsetJiapeng Chong
The variable 'internal_page_offset' is not used. Delete it. drivers/scsi/be2iscsi/be_cmds.c:1176:6: warning: variable 'internal_page_offset' set but not used. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4011 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20230209035224.90327-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: core: Fix a procfs host directory removal regressionBart Van Assche
scsi_proc_hostdir_rm() decreases a reference counter and hence must only be called once per host that is removed. This change does not require a scsi_add_host_with_dma() change since scsi_add_host_with_dma() will return 0 (success) if scsi_proc_host_add() is called. Fixes: fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name} directory earlier") Cc: John Garry <john.g.garry@oracle.com> Reported-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/all/ed6b8027-a9d9-1b45-be8e-df4e8c6c4605@oracle.com/ Reported-by: syzbot+645a4616b87a2f10e398@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-scsi/000000000000890fab05f65342b6@google.com/ Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230307214428.3703498-1-bvanassche@acm.org Tested-by: John Garry <john.g.garry@oracle.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-09scsi: core: Add BLIST_NO_VPD_SIZE for some VDASDLee Duncan
Some storage, such as AIX VDASD (virtual storage) and IBM 2076 (front end), fail as a result of commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full page"). That commit changed getting SCSI VPD pages so that we now read just enough of the page to get the actual page size, then read the whole page in a second read. The problem is that the above mentioned hardware returns zero for the page size, because of a firmware error. In such cases, until the firmware is fixed, this new blacklist flag says to revert to the original method of reading the VPD pages, i.e. try to read a whole buffer's worth on the first try. [mkp: reworked somewhat] Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page") Reported-by: Martin Wilck <mwilck@suse.com> Suggested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Lee Duncan <lduncan@suse.com> Link: https://lore.kernel.org/r/20220928181350.9948-1-leeman.duncan@gmail.com Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpi3mr: Fix expander node leak in mpi3mr_remove()Tomas Henzl
Add a missing resource clean up in .remove. Fixes: e22bae30667a ("scsi: mpi3mr: Add expander devices to STL") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20230302234336.25456-7-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpi3mr: Fix memory leaks in mpi3mr_init_ioc()Tomas Henzl
Don't allocate memory again when IOC is being reinitialized. Fixes: fe6db6151565 ("scsi: mpi3mr: Handle offline FW activation in graceful manner") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20230302234336.25456-6-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpi3mr: Fix sas_hba.phy memory leak in mpi3mr_remove()Tomas Henzl
Free mrioc->sas_hba.phy at .remove. Fixes: 42fc9fee116f ("scsi: mpi3mr: Add helper functions to manage device's port") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20230302234336.25456-5-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpi3mr: Fix mpi3mr_hba_port memory leak in mpi3mr_remove()Tomas Henzl
Free mpi3mr_hba_port at .remove. Fixes: 42fc9fee116f ("scsi: mpi3mr: Add helper functions to manage device's port") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20230302234336.25456-4-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpi3mr: Fix config page DMA memory leakTomas Henzl
A fix for: DMA-API: pci 0000:83:00.0: device driver has pending DMA allocations while released from device [count=1] Fixes: 32d457d5a2af ("scsi: mpi3mr: Add framework to issue config requests") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20230302234336.25456-3-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpi3mr: Fix throttle_groups memory leakTomas Henzl
Add a missing kfree(). Fixes: f10af057325c ("scsi: mpi3mr: Resource Based Metering") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20230302234336.25456-2-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-07scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()Wenchao Hao
Port is allocated by sas_port_alloc_num() and rphy is allocated by either sas_end_device_alloc() or sas_expander_alloc(), all of which may return NULL. So we need to check the rphy to avoid possible NULL pointer access. If sas_rphy_add() returned with failure, rphy is set to NULL. We would access the rphy in the following lines which would also result NULL pointer access. Fixes: 78316e9dfc24 ("scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add()") Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20230225100135.2109330-1-haowenchao2@huawei.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: sd: Fix wrong zone_write_granularity value during revalidateShin'ichiro Kawasaki
When the sd driver revalidates host-managed SMR disks, it calls disk_set_zoned() which changes the zone_write_granularity attribute value to the logical block size regardless of the device type. After that, the sd driver overwrites the value in sd_zbc_read_zone() with the physical block size, since ZBC/ZAC requires this for host-managed disks. Between the calls to disk_set_zoned() and sd_zbc_read_zone(), there exists a window where the attribute shows the logical block size as the zone_write_granularity value, which is wrong for host-managed disks. The duration of the window is from 20ms to 200ms, depending on report zone command execution time. To avoid the wrong zone_write_granularity value between disk_set_zoned() and sd_zbc_read_zone(), modify the value not in sd_zbc_read_zone() but just after disk_set_zoned() call. Fixes: a805a4fa4fa3 ("block: introduce zone_write_granularity limit") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20230306063024.3376959-1-shinichiro.kawasaki@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: storvsc: Handle BlockSize change in Hyper-V VHD/VHDX fileMichael Kelley
Hyper-V uses a VHD or VHDX file on the host as the underlying storage for a virtual disk. The VHD/VHDX file format is a sparse format where real disk space on the host is assigned in chunks that the VHD/VHDX file format calls the BlockSize. This BlockSize is not to be confused with the 512-byte (or 4096-byte) sector size of the underlying storage device. The default block size for a new VHD/VHDX file is 32 Mbytes. When a guest VM touches any disk space within a 32 Mbyte chunk of the VHD/VHDX file, Hyper-V allocates 32 Mbytes of real disk space for that section of the VHD/VHDX. Similarly, if a discard operation is done that covers an entire 32 Mbyte chunk, Hyper-V will free the real disk space for that portion of the VHD/VHDX. This BlockSize is surfaced in Linux as the "discard_granularity" in /sys/block/sd<x>/queue, which makes sense. Hyper-V also has differencing disks that can overlay a VHD/VHDX file to capture changes to the VHD/VHDX while preserving the original VHD/VHDX. One example of this differencing functionality is for VM snapshots. When a snapshot is created, a differencing disk is created. If the snapshot is rolled back, Hyper-V can just delete the differencing disk, and the VM will see the original disk contents at the time the snapshot was taken. Differencing disks are used in other scenarios as well. The BlockSize for a differencing disk defaults to 2 Mbytes, not 32 Mbytes. The smaller default is used because changes to differencing disks are typically scattered all over, and Hyper-V doesn't want to allocate 32 Mbytes of real disk space for a stray write here or there. The smaller BlockSize provides more efficient use of real disk space. When a differencing disk is added to a VHD/VHDX, Hyper-V reports UNIT_ATTENTION with a sense code indicating "Operating parameters have changed", because the value of discard_granularity should be changed to 2 Mbytes. When the differencing disk is removed, discard_granularity should be changed back to 32 Mbytes. However, current code simply reports a message from scsi_report_sense() and the value of /sys/block/sd<x>/queue/discard_granularity is not updated. The message isn't very actionable by a sysadmin. Fix this by having the storvsc driver check for the sense code indicating that the underly VHD/VHDX block size has changed, and do a rescan of the device to pick up the new discard_granularity. With this change the entire transition to/from differencing disks is handled automatically and transparently, with no confusing messages being output. Link: https://lore.kernel.org/r/1677516514-86060-1-git-send-email-mikelley@microsoft.com Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: megaraid_sas: Driver version update to 07.725.01.00-rc1Chandrakanth Patil
Update driver version. Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Link: https://lore.kernel.org/r/20230302105342.34933-4-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: megaraid_sas: Add crash dump mode capability bit in MFI capabilitiesChandrakanth Patil
In kdump kernel mode, the driver works in reduced functionality mode with some features disabled such as reduced MSI-X count and RDPQ disabled, etc. However, the firmware is not aware of this mode in some cases, which results in undefined behavior. To address this, the driver informs the firmware about the kdump mode through MPI capabilities bit during driver initialization. This allows firmware to adjust its behavior accordingly. Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20230302105342.34933-3-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: megaraid_sas: Update max supported LD IDs to 240Chandrakanth Patil
The firmware only supports Logical Disk IDs up to 240 and LD ID 255 (0xFF) is reserved for deleted LDs. However, in some cases, firmware was assigning LD ID 254 (0xFE) to deleted LDs and this was causing the driver to mark the wrong disk as deleted. This in turn caused the wrong disk device to be taken offline by the SCSI midlayer. To address this issue, limit the LD ID range from 255 to 240. This ensures the deleted LD ID is properly identified and removed by the driver without accidently deleting any valid LDs. Fixes: ae6874ba4b43 ("scsi: megaraid_sas: Early detection of VD deletion through RaidMap update") Reported-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20230302105342.34933-2-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: mpi3mr: Bad drive in topology results kernel crashRanjan Kumar
When the SAS Transport Layer support is enabled and a device exposed to the OS by the driver fails INQUIRY commands, the driver frees up the memory allocated for an internal HBA port data structure. However, in some places, the reference to the freed memory is not cleared. When the firmware sends the Device Info change event for the same device again, the freed memory is accessed and that leads to memory corruption and OS crash. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20230228140835.4075-7-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: mpi3mr: NVMe command size greater than 8K failsRanjan Kumar
A wrong variable is checked while populating PRP entries in the PRP page and this results in failure. No PRP entries in the PRP page were successfully created and any NVMe Encapsulated commands with PRP of size greater than 8K failed. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20230228140835.4075-6-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: mpi3mr: Return proper values for failures in firmware init pathRanjan Kumar
Return proper non-zero return values for all the cases when the controller initialization and re-initialization fails. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20230228140835.4075-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: mpi3mr: Wait for diagnostic save during controller initRanjan Kumar
If a controller reset operation is triggered to recover the controller from a fault state, then wait for the snapdump to be saved in the firmware region before proceeding to reset the controller. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20230228140835.4075-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: mpi3mr: Driver unload crashes host when enhanced logging is enabledRanjan Kumar
Prevent driver from trying to dereference a NULL pointer in a debug print while removing a device during driver unload. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20230228140835.4075-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: mpi3mr: ioctl timeout when disabling/enabling interruptRanjan Kumar
As part of Task Management handling, the driver will disable and enable the MSIx index zero which belongs to the Admin reply queue. During this transition the driver loses some interrupts and this leads to Admin request and ioctl timeouts. After enabling the interrupts, poll the Admin reply queue to avoid timeouts. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Link: https://lore.kernel.org/r/20230228140835.4075-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: lpfc: Avoid usage of list iterator variable after loopJakob Koschel
If the &epd_pool->list is empty when executing lpfc_get_io_buf_from_expedite_pool() the function would return an invalid pointer. Even in the case if the list is guaranteed to be populated, the iterator variable should not be used after the loop to be more robust for future changes. Linus proposed to avoid any use of the list iterator variable after the loop, in the attempt to move the list iterator variable declaration into the macro to avoid any potential misuse after the loop [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel <jkl820.git@gmail.com> Link: https://lore.kernel.org/r/20230301-scsi-lpfc-avoid-list-iterator-after-loop-v1-1-325578ae7561@gmail.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-03-06scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()Justin Tee
If kzalloc() fails in lpfc_sli4_cgn_params_read(), then we rely on lpfc_read_object()'s routine to NULL check pdata. Currently, an early return error is thrown from lpfc_read_object() to protect us from NULL ptr dereference, but the errno code is -ENODEV. Change the errno code to a more appropriate -ENOMEM. Reported-by: Kang Chen <void0red@gmail.com> Link: https://lore.kernel.org/all/20230226102338.3362585-1-void0red@gmail.com Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230228044336.5195-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>