summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2025-07-10scsi: lpfc: Restore clearing of NLP_UNREG_INP in ndlp->nlp_flagEwan D. Milne
[ Upstream commit 040492ac2578b66d3ff4dcefb4f56811634de53d ] Commit 32566a6f1ae5 ("scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structure") introduced a regression with SLI-3 adapters (e.g. LPe12000 8Gb) where a Link Down / Link Up such as caused by disabling an host FC switch port would result in the devices remaining in the transport-offline state and multipath reporting them as failed. This problem was not seen with newer SLI-4 adapters. The problem was caused by portions of the patch which removed the functions __lpfc_sli_rpi_release() and lpfc_sli_rpi_release() and all their callers. This was presumably because with the removal of the NLP_RELEASE_RPI flag there was no need to free the rpi. However, __lpfc_sli_rpi_release() and lpfc_sli_rpi_release() which calls it reset the NLP_UNREG_INP flag. And, lpfc_sli_def_mbox_cmpl() has a path where __lpfc_sli_rpi_release() was called in a particular case where NLP_UNREG_INP was not otherwise cleared because of other conditions. Restoring the else clause of this conditional and simply clearing the NLP_UNREG_INP flag appears to resolve the problem with SLI-3 adapters. It should be noted that the code path in question is not specific to SLI-3, but there are other SLI-4 code paths which may have masked the issue. Fixes: 32566a6f1ae5 ("scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structure") Cc: stable@vger.kernel.org Tested-by: Marco Patalano <mpatalan@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Link: https://lore.kernel.org/r/20250317163731.356873-1-emilne@redhat.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbkJustin Tee
[ Upstream commit b5162bb6aa1ec04dff4509b025883524b6d7e7ca ] Smatch detected a potential use-after-free of an ndlp oject in dev_loss_tmo_callbk during driver unload or fatal error handling. Fix by reordering code to avoid potential use-after-free if initial nodelist reference has been previously removed. Fixes: 4281f44ea8bf ("scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callback") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-scsi/41c1d855-9eb5-416f-ac12-8b61929201a3@stanley.mountain/ Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250425194806.3585-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10scsi: lpfc: Change lpfc_nodelist nlp_flag member into a bitmaskJustin Tee
[ Upstream commit 92b99f1a73b7951f05590fd01640dcafdbc82c21 ] In attempt to reduce the amount of unnecessary ndlp->lock acquisitions in the lpfc driver, change nlpa_flag into an unsigned long bitmask and use clear_bit/test_bit bitwise atomic APIs instead of reliance on ndlp->lock for synchronization. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: b5162bb6aa1e ("scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbk") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structureJustin Tee
[ Upstream commit 32566a6f1ae558d0e79fed6e17a75c253367a57f ] An RPI is tightly bound to an NDLP structure and is freed only upon release of an NDLP object. As such, there should be no logic that frees an RPI outside of the lpfc_nlp_release() routine. In order to reinforce the original design usage of RPIs, remove the NLP_RELEASE_RPI flag and related logic. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: b5162bb6aa1e ("scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbk") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10scsi: sd: Fix VPD page 0xb7 length checkjackysliu
[ Upstream commit 8889676cd62161896f1d861ce294adc29c4f2cb5 ] sd_read_block_limits_ext() currently assumes that vpd->len excludes the size of the page header. However, vpd->len describes the size of the entire VPD page, therefore the sanity check is incorrect. In practice this is not really a problem since we don't attach VPD pages unless they actually report data trailing the header. But fix the length check regardless. This issue was identified by Wukong-Agent (formerly Tencent Woodpecker), a code security AI agent, through static code analysis. [mkp: rewrote patch description] Signed-off-by: jackysliu <1972843537@qq.com> Link: https://lore.kernel.org/r/tencent_ADA5210D1317EEB6CD7F3DE9FE9DA4591D05@qq.com Fixes: 96b171d6dba6 ("scsi: core: Query the Block Limits Extension VPD page") Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10scsi: qla4xxx: Fix missing DMA mapping error in qla4xxx_alloc_pdu()Thomas Fourier
[ Upstream commit 00f452a1b084efbe8dcb60a29860527944a002a1 ] dma_map_XXX() can fail and should be tested for errors with dma_mapping_error(). Fixes: b3a271a94d00 ("[SCSI] qla4xxx: support iscsiadm session mgmt") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://lore.kernel.org/r/20250618071742.21822-2-fourier.thomas@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10scsi: qla2xxx: Fix DMA mapping test in qla24xx_get_port_database()Thomas Fourier
[ Upstream commit c3b214719a87735d4f67333a8ef3c0e31a34837c ] dma_map_XXX() functions return as error values DMA_MAPPING_ERROR which is often ~0. The error value should be tested with dma_mapping_error() like it was done in qla26xx_dport_diagnostics(). Fixes: 818c7f87a177 ("scsi: qla2xxx: Add changes in preparation for vendor extended FDMI/RDP") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://lore.kernel.org/r/20250617161115.39888-2-fourier.thomas@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-06scsi: megaraid_sas: Fix invalid node indexChen Yu
commit 752eb816b55adb0673727ba0ed96609a17895654 upstream. On a system with DRAM interleave enabled, out-of-bound access is detected: megaraid_sas 0000:3f:00.0: requested/available msix 128/128 poll_queue 0 ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in ./arch/x86/include/asm/topology.h:72:28 index -1 is out of range for type 'cpumask *[1024]' dump_stack_lvl+0x5d/0x80 ubsan_epilogue+0x5/0x2b __ubsan_handle_out_of_bounds.cold+0x46/0x4b megasas_alloc_irq_vectors+0x149/0x190 [megaraid_sas] megasas_probe_one.cold+0xa4d/0x189c [megaraid_sas] local_pci_probe+0x42/0x90 pci_device_probe+0xdc/0x290 really_probe+0xdb/0x340 __driver_probe_device+0x78/0x110 driver_probe_device+0x1f/0xa0 __driver_attach+0xba/0x1c0 bus_for_each_dev+0x8b/0xe0 bus_add_driver+0x142/0x220 driver_register+0x72/0xd0 megasas_init+0xdf/0xff0 [megaraid_sas] do_one_initcall+0x57/0x310 do_init_module+0x90/0x250 init_module_from_file+0x85/0xc0 idempotent_init_module+0x114/0x310 __x64_sys_finit_module+0x65/0xc0 do_syscall_64+0x82/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fix it accordingly. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Link: https://lore.kernel.org/r/20250604042556.3731059-1-yu.c.chen@intel.com Fixes: 8049da6f3943 ("scsi: megaraid_sas: Use irq_set_affinity_and_hint()") Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27scsi: elx: efct: Fix memory leak in efct_hw_parse_filter()Vitaliy Shevtsov
[ Upstream commit 2a8a5a5dd06eef580f9818567773fd75057cb875 ] strsep() modifies the address of the pointer passed to it so that it no longer points to the original address. This means kfree() gets the wrong pointer. Fix this by passing unmodified pointer returned from kstrdup() to kfree(). Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: 4df84e846624 ("scsi: elx: efct: Driver initialization routines") Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru> Link: https://lore.kernel.org/r/20250612163616.24298-1-v.shevtsov@mt-integration.ru Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27scsi: storvsc: Increase the timeouts to storvsc_timeoutDexuan Cui
commit b2f966568faaad326de97481096d0f3dc0971c43 upstream. Currently storvsc_timeout is only used in storvsc_sdev_configure(), and 5s and 10s are used elsewhere. It turns out that rarely the 5s is not enough on Azure, so let's use storvsc_timeout everywhere. In case a timeout happens and storvsc_channel_init() returns an error, close the VMBus channel so that any host-to-guest messages in the channel's ringbuffer, which might come late, can be safely ignored. Add a "const" to storvsc_timeout. Cc: stable@kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Link: https://lore.kernel.org/r/1749243459-10419-1-git-send-email-decui@microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27scsi: lpfc: Use memcpy() for BIOS versionDaniel Wagner
[ Upstream commit ae82eaf4aeea060bb736c3e20c0568b67c701d7d ] The strlcat() with FORTIFY support is triggering a panic because it thinks the target buffer will overflow although the correct target buffer size is passed in. Anyway, instead of memset() with 0 followed by a strlcat(), just use memcpy() and ensure that the resulting buffer is NULL terminated. BIOSVersion is only used for the lpfc_printf_log() which expects a properly terminated string. Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20250409-fix-lpfc-bios-str-v1-1-05dac9e51e13@kernel.org Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27scsi: smartpqi: Add new PCI IDsDavid Strahan
[ Upstream commit 01b8bdddcfab035cf70fd9981cb20593564cd15d ] Add in support for more PCI devices. All PCI ID entries in Hex. Add PCI IDs for Ramaxel controllers: VID / DID / SVID / SDID ---- ---- ---- ---- Ramaxel SmartHBA RX8238-16i 9005 028f 1018 8238 Ramaxel SSSRAID card 9005 028f 1f3f 0610 Add PCI ID for Alibaba controller: VID / DID / SVID / SDID ---- ---- ---- ---- HBA AS1340 9005 028f 1ded 3301 Add PCI IDs for Inspur controller: VID / DID / SVID / SDID ---- ---- ---- ---- RT0800M6E2i 9005 028f 1bd4 00a3 Add PCI IDs for Delta controllers: VID / DID / SVID / SDID ---- ---- ---- ---- ThinkSystem 4450-8i SAS/SATA/NVMe PCIe Gen4 9005 028f 1d49 0222 24Gb HBA ThinkSystem 4450-16i SAS/SATA/NVMe PCIe Gen4 9005 028f 1d49 0223 24Gb HBA ThinkSystem 4450-8e SAS/SATA PCIe Gen4 9005 028f 1d49 0224 24Gb HBA ThinkSystem RAID 4450-16e PCIe Gen4 24Gb 9005 028f 1d49 0225 Adapter HBA ThinkSystem RAID 5450-16i PCIe Gen4 24Gb Adapter 9005 028f 1d49 0521 ThinkSystem RAID 9450-8i 4GB Flash PCIe Gen4 9005 028f 1d49 0624 24Gb Adapter ThinkSystem RAID 9450-16i 4GB Flash PCIe Gen4 9005 028f 1d49 0625 24Gb Adapter ThinkSystem RAID 9450-16i 4GB Flash PCIe Gen4 9005 028f 1d49 0626 24Gb Adapter ThinkSystem RAID 9450-32i 8GB Flash PCIe Gen4 9005 028f 1d49 0627 24Gb Adapter ThinkSystem RAID 9450-16e 4GB Flash PCIe Gen4 9005 028f 1d49 0628 24Gb Adapter Add PCI ID for Cloudnine Controller: VID / DID / SVID / SDID ---- ---- ---- ---- SmartHBA P6600-24i 9005 028f 1f51 100b Add PCI IDs for Hurraydata Controllers: VID / DID / SVID / SDID ---- ---- ---- ---- HRDT TrustHBA H4100-8i 9005 028f 207d 4044 HRDT TrustHBA H4100-8e 9005 028f 207d 4054 HRDT TrustHBA H4100-16i 9005 028f 207d 4084 HRDT TrustHBA H4100-16e 9005 028f 207d 4094 HRDT TrustRAID D3152s-8i 9005 028f 207d 4140 HRDT TrustRAID D3154s-8i 9005 028f 207d 4240 Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: David Strahan <david.strahan@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20250423183229.538572-3-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27scsi: lpfc: Fix lpfc_check_sli_ndlp() handling for GEN_REQUEST64 commandsJustin Tee
[ Upstream commit 05ae6c9c7315d844fbc15afe393f5ba5e5771126 ] In lpfc_check_sli_ndlp(), the get_job_els_rsp64_did remote_id assignment does not apply for GEN_REQUEST64 commands as it only has meaning for a ELS_REQUEST64 command. So, if (iocb->ndlp == ndlp) is false, we could erroneously return the wrong value. Fix by replacing the fallthrough statement with a break statement before the remote_id check. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250425194806.3585-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19scsi: iscsi: Fix incorrect error path labels for flashnode operationsAlok Tiwari
[ Upstream commit 9b17621366d210ffee83262a8754086ebbde5e55 ] Correct the error handling goto labels used when host lookup fails in various flashnode-related event handlers: - iscsi_new_flashnode() - iscsi_del_flashnode() - iscsi_login_flashnode() - iscsi_logout_flashnode() - iscsi_logout_flashnode_sid() scsi_host_put() is not required when shost is NULL, so jumping to the correct label avoids unnecessary operations. These functions previously jumped to the wrong goto label (put_host), which did not match the intended cleanup logic. Use the correct exit labels (exit_new_fnode, exit_del_fnode, etc.) to ensure proper error handling. Also remove the unused put_host label under iscsi_new_flashnode() as it is no longer needed. No functional changes beyond accurate error path correction. Fixes: c6a4bb2ef596 ("[SCSI] scsi_transport_iscsi: Add flash node mgmt support") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://lore.kernel.org/r/20250530193012.3312911-1-alok.a.tiwari@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19scsi: smartpqi: Fix smp_processor_id() call trace for preemptible kernelsYi Zhang
[ Upstream commit 42d033cf4b517e91c187ad2fbd7b30fdc6d2d62c ] Correct kernel call trace when calling smp_processor_id() when called in preemptible kernels by using raw_smp_processor_id(). smp_processor_id() checks to see if preemption is disabled and if not, issue an error message followed by a call to dump_stack(). Brief example of call trace: kernel: check_preemption_disabled: 436 callbacks suppressed kernel: BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u1025:0/2354 kernel: caller is pqi_scsi_queue_command+0x183/0x310 [smartpqi] kernel: CPU: 129 PID: 2354 Comm: kworker/u1025:0 kernel: ... kernel: Workqueue: writeback wb_workfn (flush-253:0) kernel: Call Trace: kernel: <TASK> kernel: dump_stack_lvl+0x34/0x48 kernel: check_preemption_disabled+0xdd/0xe0 kernel: pqi_scsi_queue_command+0x183/0x310 [smartpqi] kernel: ... Fixes: 283dcc1b142e ("scsi: smartpqi: add counter for parity write stream requests") Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Tested-by: Don Brace <don.brace@microchip.com> Signed-off-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20250423183229.538572-5-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19scsi: hisi_sas: Call I_T_nexus after soft reset for SATA diskYihang Li
[ Upstream commit e4d953ca557e02edd3aed7390043e1b8ad1c9723 ] In commit 21c7e972475e ("scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure"), if the softreset fails upon certain conditions, the PHY connected to the disk is disabled directly. Manual recovery is required, which is inconvenient for users in actual use. In addition, SATA disks do not support simultaneous connection of multiple hosts. Therefore, when multiple controllers are connected to a SATA disk at the same time, the controller which is connected later failed to issue an ATA softreset to the SATA disk. As a result, the PHY associated with the disk is disabled and cannot be automatically recovered. Now that, we will not focus on the execution result of softreset. No matter whether the execution is successful or not, we will directly carry out I_T_nexus_reset. Fixes: 21c7e972475e ("scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure") Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://lore.kernel.org/r/20250414080845.1220997-4-liyihang9@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19scsi: qedf: Use designated initializer for struct qed_fcoe_cb_opsKees Cook
[ Upstream commit d8720235d5b5cad86c1f07f65117ef2a96f8bec7 ] Recent fixes to the randstruct GCC plugin allowed it to notice that this structure is entirely function pointers and is therefore subject to randomization, but doing so requires that it always use designated initializers. Explicitly specify the "common" member as being initialized. Silences: drivers/scsi/qedf/qedf_main.c:702:9: error: positional initialization of field in 'struct' declared with 'designated_init' attribute [-Werror=designated-init] 702 | { | ^ Fixes: 035f7f87b729 ("randstruct: Enable Clang support") Link: https://lore.kernel.org/r/20250502224156.work.617-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: st: Restore some drive settings after resetKai Mäkisara
[ Upstream commit 7081dc75df79696d8322d01821c28e53416c932c ] Some of the allowed operations put the tape into a known position to continue operation assuming only the tape position has changed. But reset sets partition, density and block size to drive default values. These should be restored to the values before reset. Normally the current block size and density are stored by the drive. If the settings have been changed, the changed values have to be saved by the driver across reset. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250120194925.44432-2-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: lpfc: Free phba irq in lpfc_sli4_enable_msi() when pci_irq_vector() failsJustin Tee
[ Upstream commit f0842902b383982d1f72c490996aa8fc29a7aa0d ] Fix smatch warning regarding missed calls to free_irq(). Free the phba IRQ in the failed pci_irq_vector cases. lpfc_init.c: lpfc_sli4_enable_msi() warn: 'phba->pcidev->irq' from request_irq() not released. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: lpfc: Ignore ndlp rport mismatch in dev_loss_tmo callbkJustin Tee
[ Upstream commit 23ed62897746f49f195d819ce6edeb1db27d1b72 ] With repeated port swaps between separate fabrics, there can be multiple registrations for fabric well known address 0xfffffe. This can cause ndlp reference confusion due to the usage of a single ndlp ptr that stores the rport object in fc_rport struct private storage during transport registration. Subsequent registrations update the ndlp->rport field with the newer rport, so when transport layer triggers dev_loss_tmo for the earlier registered rport the ndlp->rport private storage is referencing the newer rport instead of the older rport in dev_loss_tmo callbk. Because the older ndlp->rport object is already cleaned up elsewhere in driver code during the time of fabric swap, check that the rport provided in dev_loss_tmo callbk actually matches the rport stored in the LLDD's ndlp->rport field. Otherwise, skip dev_loss_tmo work on a stale rport. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: lpfc: Handle duplicate D_IDs in ndlp search-by D_ID routineJustin Tee
[ Upstream commit 56c3d809b7b450379162d0b8a70bbe71ab8db706 ] After a port swap between separate fabrics, there may be multiple nodes in the vport's fc_nodes list with the same fabric well known address. Duplication is temporary and eventually resolves itself after dev_loss_tmo expires, but nameserver queries may still occur before dev_loss_tmo. This possibly results in returning stale fabric ndlp objects. Fix by adding an nlp_state check to ensure the ndlp search routine returns the correct newer allocated ndlp fabric object. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: mpt3sas: Send a diag reset if target reset failsShivasharan S
[ Upstream commit 5612d6d51ed2634a033c95de2edec7449409cbb9 ] When an IOCTL times out and driver issues a target reset, if firmware fails the task management elevate the recovery by issuing a diag reset to controller. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Link: https://lore.kernel.org/r/1739410016-27503-5-git-send-email-shivasharan.srikanteshwara@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: scsi_debug: First fixes for tapesKai Mäkisara
[ Upstream commit f69da85d5d5cc5b7dfb963a6c6c1ac0dd9002341 ] Patch includes the following: - Enable MODE SENSE/SELECT without actual page (to read/write only the Block Descriptor) - Store the density code and block size in the Block Descriptor (only short version for tapes) - Fix REWIND not to use the wrong page filling function Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250213092636.2510-2-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: mpi3mr: Update timestamp only for supervisor IOCsRanjan Kumar
[ Upstream commit 83a9d30d29f275571f6e8f879f04b2379be7eb6c ] The driver issues the time stamp update command periodically. Even if the command fails with supervisor only IOC Status. Instead check the Non-Supervisor capability bit reported by IOC as part of IOC Facts. Co-developed-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250220142528.20837-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: logging: Fix scsi_logging_level boundsNicolas Bouchinet
[ Upstream commit 2cef5b4472c602e6c5a119aca869d9d4050586f3 ] Bound scsi_logging_level sysctl writings between SYSCTL_ZERO and SYSCTL_INT_MAX. The proc_handler has thus been updated to proc_dointvec_minmax. Signed-off-by: Nicolas Bouchinet <nicolas.bouchinet@ssi.gouv.fr> Link: https://lore.kernel.org/r/20250224095826.16458-5-nicolas.bouchinet@clip-os.org Reviewed-by: Joel Granados <joel.granados@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: st: ERASE does not change tape locationKai Mäkisara
[ Upstream commit ad77cebf97bd42c93ab4e3bffd09f2b905c1959a ] The SCSI ERASE command erases from the current position onwards. Don't clear the position variables. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-3-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: st: Tighten the page format heuristics with MODE SELECTKai Mäkisara
[ Upstream commit 8db816c6f176321e42254badd5c1a8df8bfcfdb4 ] In the days when SCSI-2 was emerging, some drives did claim SCSI-2 but did not correctly implement it. The st driver first tries MODE SELECT with the page format bit set to set the block descriptor. If not successful, the non-page format is tried. The test only tests the sense code and this triggers also from illegal parameter in the parameter list. The test is limited to "old" devices and made more strict to remove false alarms. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-4-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: mpi3mr: Add level check to control event loggingRanjan Kumar
[ Upstream commit b0b7ee3b574a72283399b9232f6190be07f220c0 ] Ensure event logs are only generated when the debug logging level MPI3_DEBUG_EVENT is enabled. This prevents unnecessary logging. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250415101546.204018-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES bufferSteve Siwinski
commit e8007fad5457ea547ca63bb011fdb03213571c7e upstream. The REPORT ZONES buffer size is currently limited by the HBA's maximum segment count to ensure the buffer can be mapped. However, the block layer further limits the number of iovec entries to 1024 when allocating a bio. To avoid allocation of buffers too large to be mapped, further restrict the maximum buffer size to BIO_MAX_INLINE_VECS. Replace the UIO_MAXIOV symbolic name with the more contextually appropriate BIO_MAX_INLINE_VECS. Fixes: b091ac616846 ("sd_zbc: Fix report zones buffer allocation") Cc: stable@vger.kernel.org Signed-off-by: Steve Siwinski <ssiwinski@atto.com> Link: https://lore.kernel.org/r/20250508200122.243129-1-ssiwinski@atto.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple rangesMichael Kelley
commit 380b75d3078626aadd0817de61f3143f5db6e393 upstream. vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver and is hardcoded to create a single GPA range. To allow it to also be used by the netvsc driver to create multiple GPA ranges, no longer hardcode as having a single GPA range. Allow the calling driver to specify the rangecount in the supplied descriptor. Update the storvsc driver to reflect this new approach. Cc: <stable@vger.kernel.org> # 6.1.x Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://patch.msgid.link/20250513000604.1396-2-mhklinux@outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02scsi: pm80xx: Set phy_attached to zero when device is goneIgor Pylypiv
[ Upstream commit f7b705c238d1483f0a766e2b20010f176e5c0fb7 ] When a fatal error occurs, a phy down event may not be received to set phy->phy_attached to zero. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Salomon Dushimirimana <salomondush@google.com> Link: https://lore.kernel.org/r/20250319230305.3172920-1-salomondush@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02scsi: hisi_sas: Fix I/O errors caused by hardware port ID changesXingui Yang
[ Upstream commit daff37f00c7506ca322ccfce95d342022f06ec58 ] The hw port ID of phy may change when inserting disks in batches, causing the port ID in hisi_sas_port and itct to be inconsistent with the hardware, resulting in I/O errors. The solution is to set the device state to gone to intercept I/O sent to the device, and then execute linkreset to discard and find the disk to re-update its information. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Link: https://lore.kernel.org/r/20250312095135.3048379-3-yangxingui@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02scsi: Improve CDL controlDamien Le Moal
commit 14a3cc755825ef7b34c986aa2786ea815023e9c5 upstream. With ATA devices supporting the CDL feature, using CDL requires that the feature be enabled with a SET FEATURES command. This command is issued as the translated command for the MODE SELECT command issued by scsi_cdl_enable() when the user enables CDL through the device cdl_enable sysfs attribute. However, the implementation of scsi_cdl_enable() always issues a MODE SELECT command for ATA devices when the enable argument is true, even if CDL is already enabled on the device. While this does not cause any issue with using CDL descriptors with read/write commands (the CDL feature will be enabled on the drive), issuing the MODE SELECT command even when the device CDL feature is already enabled will cause a reset of the ATA device CDL statistics log page (as defined in ACS, any CDL enable action must reset the device statistics). Avoid this needless actions (and the implied statistics log page reset) by modifying scsi_cdl_enable() to issue the MODE SELECT command to enable CDL if and only if CDL is not reported as already enabled on the device. And while at it, simplify the initialization of the is_ata boolean variable and move the declaration of the scsi mode data and sense header variables to within the scope of ATA device handling. Fixes: 1b22cfb14142 ("scsi: core: Allow enabling and disabling command duration limits") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02scsi: mpi3mr: Fix pending I/O counterRanjan Kumar
commit cdd445258db9919e9dde497a6d5c3477ea7faf4d upstream. Commit 199510e33dea ("scsi: mpi3mr: Update consumer index of reply queues after every 100 replies") introduced a regression with the per-reply queue pending I/O counter which was erroneously decremented, leading to the counter going negative. Drop the incorrect atomic decrement for the pending I/O counter. Fixes: 199510e33dea ("scsi: mpi3mr: Update consumer index of reply queues after every 100 replies") Cc: stable@vger.kernel.org Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250411111419.135485-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02scsi: core: Clear flags for scsi_cmnd that did not completeAnastasia Kovaleva
[ Upstream commit 54bebe46871d4e56e05fcf55c1a37e7efa24e0a8 ] Commands that have not been completed with scsi_done() do not clear the SCMD_INITIALIZED flag and therefore will not be properly reinitialized. Thus, the next time the scsi_cmnd structure is used, the command may fail in scsi_cmd_runtime_exceeded() due to the old jiffies_at_alloc value: kernel: sd 16:0:1:84: [sdts] tag#405 timing out command, waited 720s kernel: sd 16:0:1:84: [sdts] tag#405 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=66636s Clear flags for commands that have not been completed by SCSI. Fixes: 4abafdc4360d ("block: remove the initialize_rq_fn blk_mq_ops method") Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com> Link: https://lore.kernel.org/r/20250324084933.15932-2-a.kovaleva@yadro.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02block: remove the write_hint field from struct requestChristoph Hellwig
[ Upstream commit 61952bb73486fff0f5550bccdf4062d9dd0fb163 ] The write_hint is only used for read/write requests, which must have a bio attached to them. Just use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: fc0e982b8a3a ("block: make sure ->nr_integrity_segments is cloned in blk_rq_prep_clone") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-25scsi: megaraid_sas: Block zero-length ATA VPD inquiryChandrakanth Patil
commit aad9945623ab4029ae7789609fb6166c97976c62 upstream. A firmware bug was observed where ATA VPD inquiry commands with a zero-length data payload were not handled and failed with a non-standard status code of 0xf0. Avoid sending ATA VPD inquiry commands without data payload by setting the device no_vpd_size flag to 1. In addition, if the firmware returns a status code of 0xf0, set scsi_cmnd->result to CHECK_CONDITION to facilitate proper error handling. Suggested-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250402193735.5098-1-chandrakanth.patil@broadcom.com Tested-by: Ryan Lahfa <ryan@lahfa.xyz> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-25scsi: smartpqi: Use is_kdump_kernel() to check for kdumpMartin Wilck
[ Upstream commit a2d5a0072235a69749ceb04c1a26dc75df66a31a ] The smartpqi driver checks the reset_devices variable to determine whether special adjustments need to be made for kdump. This has the effect that after a regular kexec reboot, some driver parameters such as max_transfer_size are much lower than usual. More importantly, kexec reboot tests have revealed memory corruption caused by the driver log being written to system memory after a kexec. Fix this by testing is_kdump_kernel() rather than reset_devices where appropriate. Fixes: 058311b72f54 ("scsi: smartpqi: Add fw log to kdump") Signed-off-by: Martin Wilck <mwilck@suse.com> Link: https://lore.kernel.org/r/20250321223319.109250-1-mwilck@suse.com Cc: Randy Wright <rwright@hpe.com> Acked-by: Don Brace <don.brace@microchip.com> Tested-by: Don Brace <don.brace@microchip.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-25scsi: replace blk_mq_pci_map_queues with blk_mq_map_hw_queuesDaniel Wagner
[ Upstream commit bd326a5ad6397ccfc67af862606be107c15a43e6 ] Replace all users of blk_mq_pci_map_queues with the more generic blk_mq_map_hw_queues. This in preparation to retire blk_mq_pci_map_queues. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-5-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: a2d5a0072235 ("scsi: smartpqi: Use is_kdump_kernel() to check for kdump") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-25scsi: iscsi: Fix missing scsi_host_put() in error pathMiaoqian Lin
[ Upstream commit 72eea84a1092b50a10eeecfeba4b28ac9f1312ab ] Add goto to ensure scsi_host_put() is called in all error paths of iscsi_set_host_param() function. This fixes a potential memory leak when strlen() check fails. Fixes: ce51c8170084 ("scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param()") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20250318094344.91776-1-linmq006@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-25scsi: hisi_sas: Enable force phy when SATA disk directly connectedXingui Yang
[ Upstream commit 8aa580cd92843b60d4d6331f3b0a9e8409bb70eb ] when a SATA disk is directly connected the SAS controller determines the disk to which I/Os are delivered based on the port ID in the DQ entry. When many phys are disconnected and reconnect, the port ID of phys were changed and used by other link, resulting in I/O being sent to incorrect disk. Data inconsistency on the SATA disk may occur during I/O retries using the old port ID. So enable force phy, then force the command to be executed in a certain phy, and if the actual phy ID of the port does not match the phy configured in the command, the chip will stop delivering the I/O to disk. Fixes: ce60689e12dd ("scsi: hisi_sas: add v3 code to send ATA frame") Signed-off-by: Xingui Yang <yangxingui@huawei.com> Link: https://lore.kernel.org/r/20250312095135.3048379-2-yangxingui@huawei.com Reviewed-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-20scsi: st: Fix array overflow in st_setup()Kai Mäkisara
[ Upstream commit a018d1cf990d0c339fe0e29b762ea5dc10567d67 ] Change the array size to follow parms size instead of a fixed value. Reported-by: Chenyuan Yang <chenyuan0y@gmail.com> Closes: https://lore.kernel.org/linux-scsi/CALGdzuoubbra4xKOJcsyThdk5Y1BrAmZs==wbqjbkAgmKS39Aw@mail.gmail.com/ Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-2-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-20scsi: mpi3mr: Synchronous access b/w reset and tm thread for reply queueRanjan Kumar
[ Upstream commit f195fc060c738d303a21fae146dbf85e1595fb4c ] When the task management thread processes reply queues while the reset thread resets them, the task management thread accesses an invalid queue ID (0xFFFF), set by the reset thread, which points to unallocated memory, causing a crash. Add flag 'io_admin_reset_sync' to synchronize access between the reset, I/O, and admin threads. Before a reset, the reset handler sets this flag to block I/O and admin processing threads. If any thread bypasses the initial check, the reset thread waits up to 10 seconds for processing to finish. If the wait exceeds 10 seconds, the controller is marked as unrecoverable. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-20scsi: mpi3mr: Avoid reply queue full conditionRanjan Kumar
[ Upstream commit f08b24d82749117ce779cc66689e8594341130d3 ] To avoid reply queue full condition, update the driver to check IOCFacts capabilities for qfull. Update the operational reply queue's Consumer Index after processing 100 replies. If pending I/Os on a reply queue exceeds a threshold (reply_queue_depth - 200), then return I/O back to OS to retry. Also increase default admin reply queue size to 2K. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22scsi: qla1280: Fix kernel oops when debug level > 2Magnus Lindholm
[ Upstream commit 5233e3235dec3065ccc632729675575dbe3c6b8a ] A null dereference or oops exception will eventually occur when qla1280.c driver is compiled with DEBUG_QLA1280 enabled and ql_debug_level > 2. I think its clear from the code that the intention here is sg_dma_len(s) not length of sg_next(s) when printing the debug info. Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20250125095033.26188-1-linmag7@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-22scsi: core: Use GFP_NOIO to avoid circular locking dependencyRik van Riel
[ Upstream commit 5363ee9d110e139584c2d92a0b640bc210588506 ] Filesystems can write to disk from page reclaim with __GFP_FS set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in page reclaim with GFP_KERNEL, where it could try to take filesystem locks again, leading to a deadlock. WARNING: possible circular locking dependency detected 6.13.0 #1 Not tainted ------------------------------------------------------ kswapd0/70 is trying to acquire lock: ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0 but task is already holding lock: ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760 The full lockdep splat can be found in Marc's report: https://lkml.org/lkml/2025/1/24/1101 Avoid the potential deadlock by doing the allocation with GFP_NOIO, which prevents both filesystem and block layer recursion. Reported-by: Marc Aurèle La France <tsi@tuyoix.net> Signed-off-by: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-07scsi: core: Clear driver private data when retrying requestYe Bin
[ Upstream commit dce5c4afd035e8090a26e5d776b1682c0e649683 ] After commit 1bad6c4a57ef ("scsi: zero per-cmd private driver data for each MQ I/O"), the xen-scsifront/virtio_scsi/snic drivers all removed code that explicitly zeroed driver-private command data. In combination with commit 464a00c9e0ad ("scsi: core: Kill DRIVER_SENSE"), after virtio_scsi performs a capacity expansion, the first request will return a unit attention to indicate that the capacity has changed. And then the original command is retried. As driver-private command data was not cleared, the request would return UA again and eventually time out and fail. Zero driver-private command data when a request is retried. Fixes: f7de50da1479 ("scsi: xen-scsifront: Remove code that zeroes driver-private command data") Fixes: c2bb87318baa ("scsi: virtio_scsi: Remove code that zeroes driver-private command data") Fixes: c3006a926468 ("scsi: snic: Remove code that zeroes driver-private command data") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20250217021628.2929248-1-yebin@huaweicloud.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-17scsi: core: Do not retry I/Os during depopulationIgor Pylypiv
commit 9ff7c383b8ac0c482a1da7989f703406d78445c6 upstream. Fail I/Os instead of retry to prevent user space processes from being blocked on the I/O completion for several minutes. Retrying I/Os during "depopulation in progress" or "depopulation restore in progress" results in a continuous retry loop until the depopulation completes or until the I/O retry loop is aborted due to a timeout by the scsi_cmd_runtime_exceeced(). Depopulation is slow and can take 24+ hours to complete on 20+ TB HDDs. Most I/Os in the depopulation retry loop end up taking several minutes before returning the failure to user space. Cc: stable@vger.kernel.org # 4.18.x: 2bbeb8d scsi: core: Handle depopulation and restoration in progress Cc: stable@vger.kernel.org # 4.18.x Fixes: e37c7d9a0341 ("scsi: core: sanitize++ in progress") Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20250131184408.859579-1-ipylypiv@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17scsi: storvsc: Set correct data length for sending SCSI command without payloadLong Li
commit 87c4b5e8a6b65189abd9ea5010ab308941f964a4 upstream. In StorVSC, payload->range.len is used to indicate if this SCSI command carries payload. This data is allocated as part of the private driver data by the upper layer and may get passed to lower driver uninitialized. For example, the SCSI error handling mid layer may send TEST_UNIT_READY or REQUEST_SENSE while reusing the buffer from a failed command. The private data section may have stale data from the previous command. If the SCSI command doesn't carry payload, the driver may use this value as is for communicating with host, resulting in possible corruption. Fix this by always initializing this value. Fixes: be0cf6ca301c ("scsi: storvsc: Set the tablesize based on the information given by the host") Cc: stable@kernel.org Tested-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17scsi: qla2xxx: Move FCE Trace buffer allocation to user controlQuinn Tran
commit 841df27d619ee1f5ca6473e15227b39d6136562d upstream. Currently FCE Tracing is enabled to log additional ELS events. Instead, user will enable or disable this feature through debugfs. Modify existing DFS knob to allow user to enable or disable this feature. echo [1 | 0] > /sys/kernel/debug/qla2xxx/qla2xxx_??/fce cat /sys/kernel/debug/qla2xxx/qla2xxx_??/fce Cc: stable@vger.kernel.org Fixes: df613b96077c ("[SCSI] qla2xxx: Add Fibre Channel Event (FCE) tracing support.") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>